
TWITTER OR FACEBOOK? WHICH DATA IS MOST INDICATIVE FOR MOVIE SALES?

Word count: 11,677

Lore Aelter (student number: 01201548)
Michiel Vanluchene (student number: 01206078)

Supervisor: Prof. dr. Dirk Van den Poel

Master’s Dissertation submitted to obtain the degree of: Master of Science in Business Engineering

Academic year: 2016 - 2017



This page is not available because it contains personal information. Ghent University, Library, 2021.


Preface

This thesis is not only the written version of our research but also the culmination of our training as business engineers. It concludes an intense, educational and wonderful five years at Ghent University. As we both majored in Operations Management, opting for a master's thesis in the field of Data Analytics was not a straightforward choice. However, we both wanted to broaden our scope by immersing ourselves in an area we were unfamiliar with.

At the start of this thesis, we want to express our gratitude to the people without whom this project could not have been successful. First and foremost, we want to thank Matthias Bogaert, who guided us every step of the way, and who was always willing to answer our questions and give thorough feedback. We also appreciate the opportunity Prof. Dr. Dirk Van den Poel gave us to elaborate on a subject within the Department of Marketing.

Furthermore, we are grateful to our parents for allowing us to educate ourselves and providing us with a promising future. We also thank them and Olivier for proofreading our report. Next, we thank Sofie for providing us with guidance in the Data Analytics field.

Finally, we want to thank the reader for his or her interest and we hope he or she will enjoy reading our work.

Lore Aelter Michiel Vanluchene



Table of contents

Confidentiality agreement
Preface
Table of contents
List of used abbreviations
List of tables
List of figures
Samenvatting
Abstract
1. Introduction
2. Literature overview
3. Methodology
3.1 Data
3.2 Variables
3.2.1 Model description
3.2.2 Dependent variable description
3.2.3 Sentiment analysis
3.2.4 Predictors
3.3 Analytical techniques
3.3.1 Linear regression (LR)
3.3.2 K-nearest neighbors (KNN)
3.3.3 Decision trees (DT)
3.3.4 Bagged trees (BT)
3.3.5 Random forest (RF)
3.3.6 Gradient boosted models (GBM)
3.3.7 Neural networks (NN)
3.4 Model evaluation criteria
3.5 Cross validation
3.5.1 Information fusion-based variable importance
3.5.2 Partial dependence plots
4. Results and discussion


4.1 Benchmarking algorithms
4.2 Model performance comparison
4.3 Variable importances and partial dependence plots
4.3.1 Information fusion-based variable importances
4.3.2 Partial dependence plots
5. Conclusion and practical implications
6. Limitations and further research
References
Attachments
Attachment 1: Overview movies
Attachment 2: Overview and explanation variables
Attachment 3: Variable importances
Attachment 4: Partial dependence plot FbTwAll


List of used abbreviations

API  Application programming interface
BT  Bagged trees
CART  Classification and regression trees
CEB  Customer engagement behavior
DT  Decision trees
FbAll  Model including all data from Facebook
FbPost  Model including page info and page-generated data from Facebook
FbPostTwTweet  Model including page info and page-generated data from Facebook and Twitter
FbTwAll  Model including all data from Facebook and Twitter
GBM  Gradient boosted models
KNN  K-nearest neighbors
Lasso  Least absolute shrinkage and selection operator
LR  Linear regression
MAE  Mean absolute error
MAPE  Mean absolute percentage error
NN  Neural networks
PGC  Page-generated content
PPI  Page popularity indicators
RF  Random forest
RMSE  Root mean squared error
T  Time-based measure
TwAll  Model including all data from Twitter
TwTweet  Model including page info and page-generated data from Twitter
U  Unbounded measure
UGC  User-generated content
Val  Valence variable
Vol  Volume variable
WOM  Word-of-mouth


List of tables

Table 1: Overview existing literature using Facebook and Twitter WOM to predict box office performance
Table 2: Number of observations
Table 3: Overview basetables
Table 4: Overview variables
Table 5: Cross-validated median performance
Table 6: Friedman test with Bonferroni-Dunn post hoc test
Table 7: Wins-ties-losses
Table 8: Variable importances FbAll: top ten
Table 9: Variable importances TwAll: top ten
Table 10: Variable importances FbTwAll: top ten
Table 11: Overview movies included in analysis
Table 12: Overview and explanation variables FbPost
Table 13: Overview and explanation additional variables FbAll
Table 14: Overview and explanation variables TwTweet
Table 15: Overview and explanation additional variables TwAll
Table 16: Variable importances FbAll
Table 17: Variable importances TwAll
Table 18: Variable importances FbTwAll


List of figures

Figure 1: Histogram revenue
Figure 2: Cross-validated median performance
Figure 3: Partial dependence plots
Figure 4: Partial dependence plot number of likes
Figure 5: Partial dependence plot number of negative comments before the release
Figure 6: Partial dependence plot number of comments two weeks after the release
Figure 7: Partial dependence plot hype factor comments
Figure 8: Partial dependence plot number of neutral comments two weeks after the release
Figure 9: Partial dependence plot average sentiment score of the comments
Figure 10: Partial dependence plot average number of times the tweets are favorited
Figure 11: Partial dependence plot average number of times the replies are retweeted
Figure 12: Partial dependence plot average length of the posts
Figure 13: Partial dependence plot number of positive comments after the release



Samenvatting (Dutch summary)

This master's dissertation investigates the influence of Twitter and Facebook on movie sales and aims to identify the platform with the greatest predictive power. In addition, the most important page popularity and word-of-mouth indicators are identified. Based on data from 231 movies, 209 variables were used to predict movie sales. To discover the most important platform and variables, several models were built: not only models based on variables from a single platform, but also models based on both social media sites together. These models were created twice, once with and once without user-generated content. Five times two-fold cross-validation was used to benchmark seven algorithms (linear regression, k-nearest neighbors, decision trees, bagged trees, random forest, gradient boosted models and neural networks). So-called wins-ties-losses tables were constructed to compare the performance of the different models. The analysis shows that Facebook is the most important data source and that Facebook's page popularity indicator, the number of likes on a page, is the most important variable for predicting movie sales. The results demonstrate that the volume of traffic on a platform has slightly more predictive power than the valence of the movie in online conversations. Moreover, adding user-generated data did not yield significant improvements in explanatory power. To the best of our knowledge, this is the first study to make such an extensive comparison between two social media platforms, both in the number of methods and in the breadth of the variables used.
Our results clearly show that Facebook is the platform with the best predictive power and that a marketing department's focus is best placed on obtaining as many likes as possible on the movie's Facebook page.


Abstract

This paper investigates the influence of Twitter and Facebook on box office revenue and aims to discover which platform holds the most predictive power, while also identifying the most important page popularity and word-of-mouth (WOM) indicators. Facebook and Twitter data from 231 movies were obtained and 209 variables in total were used to predict movie sales. To detect the most important platform and variables, multiple models were built. Models including variables from a single source were constructed, as well as models based on both platforms. These models were created twice, once excluding and once including data concerning the comments on Facebook posts and the replies to tweets. Five times two-fold cross-validation was used to benchmark seven algorithms (linear regression, k-nearest neighbors, decision trees, bagged trees, random forest, gradient boosted models and neural networks). Wins-ties-losses tables were created to compare the performance of the different models. The analysis shows Facebook to be the most indicative data source and that Facebook's page popularity indicator, the number of likes on the movie's Facebook page, is the most important variable when predicting box office revenues. The results also show that volume measures are slightly more indicative than valence measures and that including comment data from Facebook and reply data from Twitter does not significantly increase the predictive power of the models. To the best of our knowledge, this is the first paper to conduct such an extensive comparison between two social media platforms, both in terms of the number of algorithms and in terms of the comprehensiveness and completeness of the variable set. Our results clearly show that Facebook holds the most predictive power and that the main focus of a marketing department should be on generating as many likes as possible on the movie's Facebook page.


1. Introduction

Since the rise of social media, substantial research has been devoted to studying the relationship between social media and movie sales (Dhar & Chang, 2009). In particular, data originating from social media sites such as Yahoo!Movies and Twitter have been used as input to explain this relationship (Duan, Gu, & Whinston, 2008a; Rui, Liu, & Whinston, 2013). Most of these studies found that online word-of-mouth (WOM) is one of the most important indicators of sales (Chintagunta, Gopinath, & Venkataraman, 2010; Liu, 2006). Moreover, Asur & Huberman (2010) concluded in their study that online WOM has more predictive power than other, more traditional data sources such as the Hollywood Stock Exchange index, a virtual stock exchange for the entertainment industry. These findings provide important insights for practitioners, since they allow them to focus on the most influential elements of online WOM to boost their revenues. For example, research regarding the influence of chatter on Twitter on movie sales has found that the number of tweets and the positive tweets ratio are important influencers of box office sales (Rui et al., 2013).

Even though the predictive power of Twitter on movie sales has been studied extensively, less attention has been paid to Facebook. This is a missed opportunity, since Facebook contains a great number of potentially interesting predictors of movie sales (Lo, 2010). Because both Facebook and Twitter generate an enormous amount of data, practitioners need to prioritize one data platform over the other. For example, if millions of tweets and Facebook posts are available for one movie, extracting both data sources takes a very long time. Hence, producers need to prioritize which data they collect to make predictions in a timely manner. It is therefore of utmost importance to know which data source provides the most insight into explaining movie sales. To our knowledge, no study has carried out a comprehensive comparison between Facebook and Twitter in predicting movie sales. Such an analysis could also help a marketing department focus its efforts on generating buzz on the platform which has the most impact on revenues.

This paper makes several contributions. First, we predict movie sales using data from both Facebook and Twitter across seven algorithms to ensure robust results. This study is the first to conduct a comprehensive and robust comparison between both platforms to determine which holds the most predictive power. Second, we are the first to thoroughly investigate the predictive power of both Facebook and Twitter on box office revenues based on an extensive set of variables. We elaborate on the most indicative measures from both data sources and discuss their relationship with box office revenues. To do so, we scraped the relevant data of 231 movies from their respective Facebook and Twitter accounts.

The remainder of this paper is organized as follows. First, we provide an overview of the existing literature. Second, we discuss the extracted data, the variables and the methodology used. Third, we describe our results and practical implications. To conclude, we elaborate on the limitations of our research and avenues for future research.

2. Literature overview

Research has widely employed WOM data to build predictive models for product sales in various fields, such as the music industry (Dhar & Chang, 2009) and iPod sales (Jin, Gallagher, Cao, Luo, & Han, 2010). This is explained by the fact that WOM data holds more predictive power than other available data sources (Asur & Huberman, 2010; Joshi, Das, Gimpel, & Smith, 2010). However, most research concerning WOM and sales focuses on movie box office predictions. Several reasons make movies a suitable subject of study. First of all, data on performance is easily accessible through sites such as BoxOfficeMojo.com. Additionally, movies are experience goods. People want to estimate the quality of the movie before paying to watch it and rely on other people’s opinions to do so (Huang, Boh, & Goh, 2011).

Most of the past research on predicting box office performance using WOM has focused on Yahoo!Movies (Chintagunta et al., 2010; Dellarocas, Zhang, & Awad, 2007; Duan, Gu, & Whinston, 2008b; Y. Liu, 2006). For example, Liu (2006) investigated the relation between WOM on Yahoo!Movies and box office revenues. He found that the chatter on Yahoo!Movies contains the most explanatory power on movie performance in the first weeks after the movie is released. In another study, Duan et al. (2008a) found that the amount of posts on Yahoo!Movies has a significant impact on movie revenues, contrary to the user ratings. Together with the rise of social media, research concerning box office revenues and WOM has shifted from more traditional internet sites (e.g., Yahoo!Movies and blog posts) to social media sites such as Facebook and Twitter (e.g., Ding, Cheng, Duan, & Jin, 2017; Oh, Roumani, Nwankpa, & Hu, 2016). The reason for this shift is twofold. First, a lot of WOM data is available on these platforms, due to their large user base. Second, Facebook and Twitter contain a select number of page popularity indicators, which have proven to be good predictors in several studies (Ding et al., 2017; Oh et al., 2016). Since the primary interest of our study is to identify which social media channel is the most indicative for movie sales, we will only focus on studies concerning box office revenues and social media data in the remainder of the literature overview.

Studies on social media data and box office revenues can be categorized along several dimensions: (1) whether they use Facebook or Twitter data, (2) whether the researchers use different algorithms to predict box office revenues, (3) whether multiple types of WOM metrics of a single platform are incorporated and (4) whether Facebook and Twitter are compared to see which platform has the most predictive power (Table 1). Studies including Twitter harness the predictive power of data generated on this platform to forecast movie sales (Asur & Huberman, 2010; Rui et al., 2013). For example, Rui et al. (2013) used a dynamic panel data model to perform regression analysis and found that people with more followers on Twitter have a higher influence on movie sales and, moreover, that tweets expressing the intention to watch a movie have the strongest impact. Studies using Facebook data attempt to harness the predictive power of this social medium (Ding et al., 2017; Oh et al., 2016). For example, Oh et al. (2016) found a positive correlation between the number of Facebook likes and box office revenues. Next, studies that compare several algorithms do not use a single prediction model to estimate box office revenues, but instead build several models and compare their performance. For example, Liu et al. (2016) used not only linear regression but also support vector regression with radial basis function and linear kernels to predict box office performance based on Twitter data. For predicting early box office performance, the linear regression model and the support vector regression with linear kernels outperformed the support vector regression with radial basis function kernels, while the opposite was found for the prediction of the movie's gross performance. The next dimension for categorizing movie sales prediction is whether or not the studies use several types of WOM metrics to predict box office revenues.
For example, Asur & Huberman (2010) found that, next to the rate of tweets, the sentiment of the tweets was a good predictor of box office revenues. Hence, their study uses both the volume and the sentiment of the tweets and therefore incorporates several metrics to predict movie sales. On the other hand, some studies only incorporated one type of metric to estimate movie sales (Oh et al., 2016). For example, the study of Jain (2013) only used a simple positive versus negative tweets ratio to predict whether a movie would be a hit, a failure or an average performer. The only metric type used was the sentiment of the tweets; the volume of tweets was not considered.


Lastly, studies concerning box office revenue prediction can be categorized based on whether or not they compare Facebook and Twitter to investigate which one holds the most forecasting power. To the best of our knowledge, only the study of Oh et al. (2016) has taken preliminary steps toward comparing the two social media platforms. Their conclusions state that, most likely, Facebook holds the most explanatory power.

Table 1 summarizes the literature on movie sales and social media data. From Table 1, it is clear that, to our knowledge, no studies have compared social media data sources over multiple variable types and prediction algorithms. The study of Oh et al. (2016) is of special interest since it is the only one that also compares social media data sources.

In their study, Oh et al. (2016) investigate how customer engagement behavior (CEB) on social media is associated with movie performance. They used a robust ordinary least squares stepwise regression model and found that Twitter has a significant effect on box office revenue. However, when adding Facebook to the model, the significance of Twitter disappeared. As can be seen in Table 1, the study had some limitations. The research only used one metric per data source to explain the influence of WOM on movie sales. More specifically, they distinguished between personal and interactive CEB. Personal CEB measures the popularity of a page and is concretized by the number of likes on Facebook and the number of followers on Twitter. Interactive CEB measures WOM by the total amount of talk about the movie on Facebook and the total number of tweets on Twitter. However, as seen in other studies, the influence of WOM consists of both volume and valence (Asur & Huberman, 2010). Hence, Oh et al. (2016) only include the volume of WOM but neglect the valence of WOM. The second limitation of the study of Oh et al. (2016) is the fact that they only used one algorithm to estimate the influence of the variables. However, to find out whether one source of data dominates another, multiple models need to be built and compared to detect significant differences.

We intend to fill these gaps in the literature by comparing the predictive power of Facebook and Twitter using several prediction models and including both the volume and valence of WOM. Next to WOM metrics, we include the page popularity indicators, such as the number of likes for Facebook and the number of followers on Twitter. We use a hybrid data analytical approach consisting of two stages. In the first stage, we build multiple models using Twitter and Facebook data. We benchmark several algorithms on these datasets and compare them using five times two-fold cross-validation. In the second stage, we combine the prediction models from the first stage using information fusion-based variable importance analysis (Oztekin, Delen, Turkyilmaz, & Zaim, 2013). Using information fusion, we assess variable importances and build partial dependence plots for both data sources. By doing so, we can evaluate the impact of the different social media metrics of Facebook and Twitter on movie sales.
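The fusion idea can be sketched in a few lines. This is our own minimal illustration of the principle (normalize each model's importances, weight them by that model's predictive performance, and sum), not the exact procedure of Oztekin et al. (2013); all names are ours.

```python
def fused_importance(importances_per_model, model_weights):
    """Combine variable importances from several prediction models.

    importances_per_model: dict of model name -> {variable: raw importance}
    model_weights: dict of model name -> weight (e.g., cross-validated
    performance), reflecting how much each model's ranking should count.
    Returns {variable: fused importance}, normalized to sum to 1.
    """
    fused = {}
    for model, importances in importances_per_model.items():
        total = sum(importances.values())
        for var, imp in importances.items():
            # normalize within the model, then weight by model performance
            fused[var] = fused.get(var, 0.0) + model_weights[model] * (imp / total)
    grand_total = sum(fused.values())
    return {var: value / grand_total for var, value in fused.items()}
```

A model that predicts well thus pulls the fused ranking toward its own view of which variables matter, while a weak model contributes little.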

Table 1: Overview existing literature using Facebook and Twitter WOM to predict box office performance

| Authors | Platform | Compares different algorithms | Uses several WOM metrics | Compares data sources |
|---|---|---|---|---|
| Apala et al. (2013) | Twitter | | | |
| Arias, Arratia, & Xuriguera (2014) | Twitter | X | X | |
| Asur & Huberman (2010) | Twitter | | X | |
| Ding et al. (2017) | Facebook | | | |
| El Assady et al. (2013) | Twitter | | X | |
| Gaikar, Marakarkandy, & Dasgupta (2015) | Twitter | | X | |
| Guàrdia-Sebaoun, Rafrafi, Guigue, & Gallinari (2013) | Twitter | | X | |
| Hennig-Thurau, Wiertz, & Feldhaus (2015) | Twitter | | X | |
| Jain (2013) | Twitter | X | | |
| Kim, Hong, & Kang (2015) | Twitter (1) | X | X | |
| Oh et al. (2016) | Twitter, Facebook | | | X |
| T. Liu et al. (2016) | Twitter | X | X | |
| Rui et al. (2013) | Twitter | | X | |
| SivaSantoshReddy, Kasat, & Jain (2012) | Twitter | | X | |
| Wong, Sen, & Chiang (2012) | Twitter | | X | |
| Our study | Twitter, Facebook | X | X | X |

(1) This study is performed with social network service (SNS) data. Data was obtained from pulseK, which aggregates SNS data and performs a sentiment analysis. It is explicitly mentioned that Twitter data is part of this dataset; whether Facebook data was included is not mentioned.


In line with previous research, we expect Facebook to be more indicative for movie sales than Twitter. First, it is the most popular social media platform, with more than 1.86 billion active users (Facebook, 2017). Second, Lo (2010) found that Facebook is relied on more heavily as a source of information on movies than specialized sites, such as Yahoo!Movies and BoxOfficeMojo.com. Moreover, that study concludes that Facebook is also more important than other popular social network sites such as Twitter and MySpace. Additionally, Oh et al. (2016) found that Twitter has a significant effect on movie performance when the model uses only Twitter metrics. However, in a model combining Twitter and Facebook data, only Facebook was found to be of significant importance.

In agreement with the literature, we believe that Facebook and Twitter predictors related to both page popularity indicators and WOM will be of major importance. Page popularity indicators impact movie box office revenues since they reflect the number of people that demonstrate interest in the movie. Showing this interest has a high likelihood of binding the individuals to the movie (Oh et al., 2016). WOM is believed to influence movie-going through two mechanisms. The first is known as the awareness effect, which states that people can only consider products of whose existence they are aware. As the volume of online chatter increases, awareness increases and the movie becomes part of the potential customer's consideration set. The second way WOM influences people is through the persuasive effect: the information people receive about a product helps them form their attitude and opinions towards it, which eventually affects their decision to purchase that product (Duan et al., 2008a). As the content, or the valence, of the WOM becomes more positive, the persuasive effect grows larger. A lot of research concentrates on these two measures (volume and valence), but not all researchers come to the same conclusions. Some find that both valence and volume matter (Asur & Huberman, 2010; Dellarocas et al., 2007; Onishi & Manchanda, 2012; Rui et al., 2013). Rui et al. (2013) found that both the volume and valence of tweets influence movie sales. On the other hand, Liu et al. (2010) predicted movie box office sales using data from Yahoo!Movies and found volume, measured by the number of posts, to be the most important influencer; no significant effect of valence on sales was found. Chintagunta et al. (2010) measured the impact of WOM on local box office revenue and found valence, and not volume, to be the greatest influencer. However, when they aggregated their data to a nationwide level, they concluded that volume, and not valence, matters.


At least two reasons explain the contradictory conclusions of previous research. First, different categories of platforms are used to gather WOM data. The different dynamics on each platform could have an impact on the importance of volume and valence. For example, Yu, Duan, & Cao (2013) found the influence of blogs to be positive and fora to have a negative effect on stock performance, which they related to the fact that, overall, blogs contain more positive messages and fora more negative ones. The second cause is the broad range of alternatives that have been devised to construct volume and valence measures. For instance, Asur & Huberman (2010) used a simple positive to negative tweet ratio as a proxy variable for valence, while Liu (2006) employs consumer ratings as a valence indicator.

We are the first in the literature to test the influence of both the page popularity indicators and the valence and volume of WOM on movie sales on both social media platforms. To exhaustively measure the effects of social media on box office revenues, we incorporate all volume and valence metrics that have been constructed in the literature, as well as the page popularity indicators. In line with the literature, we hypothesize that page popularity, the volume of WOM and the amount of positive WOM will have a positive influence on box office sales.

In conclusion, we are the first in the literature to make an extensive comparison between Facebook and Twitter for movie sales over several prediction algorithms and a comprehensive set of metrics. Moreover, we are also the first to investigate the important variables and their relationship with sales for both data platforms. We hypothesize that the volume and valence of WOM, as well as the page's popularity, are positively related to box office revenues. Furthermore, we also expect Facebook to be more indicative than Twitter.

3. Methodology

3.1 Data

We extracted social media data from 231 movies released between 2012 and 2015 that have both an official Facebook and Twitter page (Attachment 1). As other researchers incorporated fewer movies, we believe our results are more robust. For example, Rui et al. (2013) used 63 movies and Asur & Huberman (2010) incorporated 24. We obtained data from the timelines of both the Facebook and Twitter pages from their very first instance until August 2016. The collected social media data can be categorized into three types: page popularity indicators, page-generated content and user-generated content. Page popularity indicators refer to meta-information about a Facebook or Twitter page, such as the page's likes and the number of followers. Page-generated content contains the Facebook posts or tweets created by the page owners themselves and aggregate information about the interactions on the created content. For example, the total number of likes on a Facebook post or the average number of retweets is categorized as page-generated content. User-generated content consists of the Facebook comments and Twitter replies posted by other users and the interactions on those comments and replies. Page-generated content and user-generated content jointly compose the online conversation, i.e., online WOM. Sales data was collected for the same time window (Box Office Mojo, 2016).

To collect page-generated and user-generated content from Facebook, we used the Facebook API (Facebook, 2016). To gather the timelines on Twitter, we used the official Twitter API (Twitter, 2016). Since the official Twitter API limits the time window from which one is able to gather replies to tweets, we could only extract up to 25% of the total user-generated content. Nevertheless, we believe that this data is representative of all user-generated content and hence can be used to compare the effectiveness of Twitter as a data source. Table 2 summarizes the aggregate number of observations for the page- and user-generated content from the 231 Facebook and Twitter pages.

Table 2: Number of observations

| Type of content | Facebook | Twitter |
|---|---|---|
| Page-generated content | 95,725 | 150,699 |
| User-generated content | 2,966,421 | 190,196 |
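As an illustration of how such an extraction proceeds, the sketch below walks a Graph-API-style paginated response, where each page of results carries a `data` array and a `paging.next` link to the following page. The function names and the fetch wrapper are our own simplified assumptions, not the actual extraction code used for the thesis.

```python
def collect_all(fetch, first_url):
    """Collect every item from a paginated Graph-API-style endpoint.

    fetch: a callable that takes a URL and returns the decoded JSON dict
    (in practice this would be a thin wrapper around an HTTP GET with an
    access token; here it is kept abstract so the logic stays testable).
    Follows 'paging' -> 'next' links until no further page exists.
    """
    items, url = [], first_url
    while url:
        page = fetch(url)
        items.extend(page.get("data", []))          # posts/comments on this page
        url = page.get("paging", {}).get("next")    # None when exhausted
    return items
```

Keeping the HTTP call behind a plain callable also makes rate-limit handling and retries easy to add in one place.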

3.2 Variables

3.2.1 Model description

Based on the three categories described in the Data section (Section 3.1), we composed six distinct basetables. The basetables enable us to easily compare the predictive power of Facebook and Twitter in box office predictions. We note that the additional computational effort to incorporate the user-generated content in the analysis is substantial. Therefore, we constructed separate models, one including and one excluding the reply and comment variables. By doing so, we are able to investigate whether the additional predictive power of comments and replies justifies the additional computational effort. Table 3 gives an overview of these six basetables, together with the number of variables included in each model.

10

Table 3: Overview basetables

| Basetable | Facebook page popularity indicators | Facebook post data | Facebook comment data | Twitter page popularity indicators | Twitter tweet data | Twitter reply data | # variables |
|---|---|---|---|---|---|---|---|
| FbPost | X | X | | | | | 56 |
| FbAll | X | X | X | | | | 103 |
| TwTweet | | | | X | X | | 58 |
| TwAll | | | | X | X | X | 106 |
| FbPostTwTweet | X | X | | X | X | | 114 |
| FbTwAll | X | X | X | X | X | X | 209 |
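The variable counts in Table 3 are additive over feature groups, which can be checked with a short sketch. The sizes of the comment and reply blocks are derived from the table itself (103 - 56 and 106 - 58); the split of the 56 Facebook and 58 Twitter variables into page popularity versus post/tweet variables is not reported, so those groups are kept merged, and all names below are our own.

```python
# Feature-group sizes inferred from Table 3.
groups = {
    "fb_page_and_post": 56,    # the FbPost variable set
    "fb_comment": 103 - 56,    # extra variables in FbAll -> 47
    "tw_page_and_tweet": 58,   # the TwTweet variable set
    "tw_reply": 106 - 58,      # extra variables in TwAll -> 48
}

# Each basetable is a union of feature groups.
basetables = {
    "FbPost": ["fb_page_and_post"],
    "FbAll": ["fb_page_and_post", "fb_comment"],
    "TwTweet": ["tw_page_and_tweet"],
    "TwAll": ["tw_page_and_tweet", "tw_reply"],
    "FbPostTwTweet": ["fb_page_and_post", "tw_page_and_tweet"],
    "FbTwAll": ["fb_page_and_post", "fb_comment",
                "tw_page_and_tweet", "tw_reply"],
}

n_variables = {name: sum(groups[g] for g in parts)
               for name, parts in basetables.items()}
```

The computed counts reproduce the "# variables" column of Table 3 exactly (56, 103, 58, 106, 114 and 209).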

3.2.2 Dependent variable description

In accordance with the findings of Pan & Sinha (2010), the distribution of the box office revenue diverges from a normal distribution (Figure 1a). In their study, Pan & Sinha (2010) examined the distribution of the revenue of 5,222 movies. They found that the tail of the distribution of the revenues is much too heavy to comply with the Gaussian distribution. Their results showed that the revenues follow a log-normal distribution. In line with their study, we apply a logarithmic transformation to the revenue in our models (Figure 1b).

Figure 1: Histogram revenue

3.2.3 Sentiment analysis

In order to determine the valence of the data, sentiment analysis was performed. This starts with text cleaning: transforming the text to lower case and removing punctuation, numbers and web addresses. Afterwards, a lexicon-based method was applied, which compares each word in the text-item with a predefined lexicon. This lexicon is a catalogue of words that associates each word with an inherent happiness score. If a particular word is located in the lexicon, the matching valence score from the lexicon is assigned to it (Flanagan, 2015). This valence score ranges from one to nine, one being assigned to a word perceived as very sad and nine to a word that reflects happiness. To make the scores more interpretable, they were rescaled to lie between -5 and 5, which assigns a value of zero to a neutral word.

To derive the total score of an entire text-item, a naïve Bayesian method is applied (McCallum & Nigam, 1998). The naïve Bayes assumption states that all attributes in a text are independent of each other. Despite this assumption, the method has been proven to perform very well (N. Friedman, Geiger, & Goldszmidt, 1997; McCallum & Nigam, 1998). This lexicon-based method was implemented because Kalampokis, Tambouris, & Tarabanis (2013) found that most studies concerning the influence of WOM on social media incorporate this kind of sentiment analysis. Under the naïve Bayes assumption, the valence score of an entire text-item is obtained by averaging the valence scores of the separate words in the text. This way, each text-item is given a score between -5 and 5, -5 being the most negative and +5 the most positive. If no word in the text corresponded with a word in the dictionary, we excluded the text-item from the sentiment analysis. This could occur for several reasons, the most apparent one being that the text-item was not written in English.

As we wanted to improve interpretability, we classified all items as positive, negative or neutral, in accordance with Rui et al. (2013). Text-items with a valence score higher than 0.5 were classified as positive, text-items with a score lower than -0.5 as negative, and text-items with a score between -0.5 and 0.5 as neutral.
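The pipeline above can be sketched as follows. The lexicon excerpt and the linear rescaling from the 1-9 scale to [-5, 5] are illustrative assumptions, not the thesis' exact implementation:

```python
import re

# Hypothetical excerpt of a happiness lexicon (scores on a 1-9 scale).
LEXICON = {"love": 8.4, "great": 7.8, "boring": 3.0, "terrible": 1.9, "movie": 5.5}

def clean(text):
    """Lower-case and strip web addresses, numbers and punctuation."""
    text = text.lower()
    text = re.sub(r"http\S+", " ", text)   # web addresses
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation and numbers
    return text.split()

def rescale(score):
    """Map a 1-9 happiness score linearly onto [-5, 5], with 5 -> 0 (neutral)."""
    return (score - 5.0) * 5.0 / 4.0

def valence(text):
    """Average rescaled score of the matched words; None if nothing matches."""
    scores = [rescale(LEXICON[w]) for w in clean(text) if w in LEXICON]
    return sum(scores) / len(scores) if scores else None

def classify(score):
    """Classification into positive/negative/neutral with the 0.5 thresholds."""
    if score is None:
        return "discarded"   # no lexicon match, excluded from the analysis
    if score > 0.5:
        return "positive"
    if score < -0.5:
        return "negative"
    return "neutral"
```

Text-items without any lexicon match return None and are discarded, mirroring the exclusion rule described above.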

3.2.4 Predictors

We opted to include comparable variables for Twitter and Facebook in order to make a fair comparison between both platforms. Each variable included for one platform has a comparable variable for the other social media site. The aim was to incorporate as many variables as possible, ensuring our ability to determine the most important predictors while maximizing the predictive power of our analysis. To achieve this, we included all relevant variables present in the current literature as well as additional combinatorial variables. We note that only variables concerning social media were included, even though other researchers have included variables such as the number of screens and the genre of the movie (Ding et al., 2017; Oh et al., 2016). We did not include these variables, as we are only interested in the predictive power of social media and how the platforms compare to each other. Therefore, we deliberately did not include predictors based on other data sources.

The variables are categorized according to the aforementioned categories: page popularity indicators, page-generated content and user-generated content. Both page- and user-generated content consist of volume and valence metrics. It should be noted that some valence variables also contain an indication of volume. For example, the number of positive posts is classified as a valence measure because it holds information on the sentiment of the posts, even though it also has a volume component, as it represents the number of posts with a positive connotation. The volume and valence categories are both further divided into unbounded and time-based variables. Time-based variables take the creation date of the item in relation to the release date of the movie into account. Unbounded measures have no time-frame restriction and cover all non-time-based measures. Table 4 presents these categories, with a couple of examples for each category. Attachment 2 gives an overview of all variables.

Table 4: Overview variables

The first category is the page popularity indicators: the number of followers on Twitter, the number of likes on the Twitter page and the number of likes on Facebook (Ding et al., 2017; Oh et al., 2016; Rui et al., 2013). These variables have proven to be good indicators in several studies. For example, Ding et al. (2017) concluded that a 1% increase in the number of likes a movie's Facebook page has one week prior to the release results in a 0.2% increase in box office revenue for the opening weekend. A comparable conclusion was reached by Rui et al. (2013), who found that having more followers on Twitter increases a page's influence.


The second category contains the page-generated content, which consists of all variables concerning the posts and tweets of a page, posted by the movie's marketing department. This category is further classified into four subcategories. (1) The unbounded volume variables form the first dimension and describe the frequency and other characteristics of posts and tweets over the entire time period. For example, the number of tweets is the sum of all tweets posted from the creation of the page until August 2016. Another set of variables includes tweet rates, based on the variables used in the study of Asur & Huberman (2010), who used the number of tweets per hour in their analysis. However, their study only covered the first three months after the movie was released. As our time frame is much larger, we included the average number of tweets per day. (2) The second subcategory consists of the time-based volume variables. These predictors are based on data within a specified timeframe, relating to volume indicators. This includes, among others, the number of posts before the movie's release. In our aim to include all relevant variables found in the literature, we followed the suggestions of Ding et al. (2017) and Kim et al. (2015), and calculated variables not only before and after the release of the movie (Wong et al., 2012), but also one week prior and two weeks after the release. This period is defined by Asur & Huberman (2010) as the critical period, as promotion peaks the week before the release and the hype fades out two weeks after the release. An example is the variable percentage of posts posted one week before the release. Furthermore, both absolute and relative measures are incorporated (Asur & Huberman, 2010; Hennig-Thurau et al., 2015). Thus, the number of posts before the release is included, as well as the percentage of the total posts that are posted before the release.
(3) Next, the unbounded valence variables concern the sentiment of the data over the entire time frame. Both measures based on the sentiment scores (e.g., the average sentiment score of the posts or tweets) and indicators using the classification into positive, neutral and negative tweets (e.g., the number of positive posts) were incorporated. Additionally, following Asur & Huberman (2010), we defined a positive/negative ratio, i.e., the number of positive posts divided by the number of negative posts. In their study, Asur & Huberman (2010) found that adding this measure significantly improved their prediction model. (4) Lastly, the time-based valence variables are analogous to the time-based volume variables, meaning these variables are also based on their relation to the release date. For example, the percentage of tweets that are neutral two weeks after the release combines the suggestion of Hennig-Thurau et al. (2015) to calculate measures two weeks after the release with the suggestion of Asur & Huberman (2010) to not only include absolute figures. Relative measures were thus used in addition to absolute measures, as the percentage of negative tweets relative to the total number of tweets can have a different influence than the number of negative tweets in general. Furthermore, the change in sentiment scores prior to and after the release was included in our model, as suggested by Kim et al. (2015). This variable measures the effect of a movie being better or worse than anticipated before its release.

The last category is the user-generated content, which consists of variables related to replies on tweets and comments on posts. The generation of the variables, both for the volume variables and for the valence measures, is comparable to that for the page-generated content, with a few additions. For example, we calculated the hype factor as proposed by Gaikar et al. (2015), which is the ratio between the number of distinct users and the number of comments or replies. This variable measures how many distinct users, relative to the number of comments or replies, have reacted. The closer the hype factor is to one, the more different users have reacted. If the factor is low (closer to zero), more people are commenting or replying more than once.
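The hype factor of Gaikar et al. (2015) is a one-line ratio; a minimal sketch, in which the field names and sample data are illustrative:

```python
def hype_factor(reactions):
    """Distinct users divided by the number of comments or replies.

    reactions: list of (user_id, text) tuples for one movie page.
    """
    if not reactions:
        return 0.0
    distinct_users = {user_id for user_id, _ in reactions}
    return len(distinct_users) / len(reactions)

# Four comments by three distinct users: u1 commented twice.
comments = [
    ("u1", "loved it"),
    ("u2", "meh"),
    ("u1", "seeing it again"),
    ("u3", "great"),
]
```

With the sample above, three distinct users over four comments yields a factor of 0.75, i.e., a broad rather than repeat-heavy audience.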

3.3 Analytical techniques

To draw conclusions regarding the predictive power of the distinct models, multiple techniques need to be used and compared. To this end, seven algorithms were used to benchmark our six datasets. Each algorithm is briefly discussed below.

3.3.1 Linear regression (LR)

Linear regression investigates the relative influence of each predictor on the dependent variable. It assumes linear relationships between the independent and the dependent variables and a normal distribution of the variables (Zou, Tuncali, & Silverman, 2003). We apply linear regression with the lasso (least absolute shrinkage and selection operator) to control for overfitting and make the results more interpretable. Due to the lasso's shrinkage operation, the total sum of the absolute regression coefficients is restricted to a predefined value. As a consequence, some of the regression coefficients are forced to shrink to zero (Tibshirani, 1995). The goal of the lasso is to solve the following equation (Hastie, Tibshirani, & Friedman, 2009):

\min_{\beta_0,\,\beta} \frac{1}{N} \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t \qquad (1)

In this equation, N is the number of observations (231 in our case), p the number of variables in the model under investigation (Table 3), y_i the outcome for observation i and x_ij the value of variable j for case i. β_j is the parameter assigned to variable j. Lastly, t (≥ 0) is a predefined tuning parameter, which causes the shrinkage of the coefficients: the smaller this tuning parameter, the stricter the constraint and the more shrinkage will occur (Tibshirani, 1995). To ensure the value of the tuning parameter is optimized, we cross-validated its value in terms of RMSE.
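As a concrete sketch, a cross-validated lasso can be reproduced with scikit-learn's LassoCV, which tunes the penalty in its Lagrangian form, the equivalent of tuning the constraint t in Equation (1). The tooling and the synthetic data below are assumptions; the thesis does not name its software:

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic stand-in for a basetable: 231 movies, 56 predictors (the
# FbPost dimensions), with log-revenue driven by the first two columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(231, 56))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=231)

# LassoCV cross-validates the penalty strength by mean squared error,
# the squared counterpart of the RMSE criterion used in the text.
lasso = LassoCV(cv=5).fit(X, y)
print("non-zero coefficients:", int(np.sum(lasso.coef_ != 0)))
```

At the cross-validated penalty, many of the pure-noise coefficients are typically shrunk exactly to zero, which is the selection behavior the text describes.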

3.3.2 K-nearest neighbors (KNN)

K-nearest neighbors is a non-parametric pattern recognition method. The first step of the method is to identify an observation's k nearest neighbors in the training set. The response for the observation is then predicted as the average of the responses of these nearest neighbors. This simple method makes no assumptions concerning the form of the regression function (Altman, 1992; Roy, 2015). Roy (2015) discusses the benefits of using a high k, arguing that a large k reduces the variance when a lot of noise is present in the data. He also notes a drawback of inflating k too much: smaller trends and patterns might not be recognized. In our analysis, we cross-validated the value of k in terms of RMSE.
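The cross-validation of k in terms of RMSE can be sketched as follows, again assuming scikit-learn and synthetic data:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(231, 5))
y = X.sum(axis=1) + rng.normal(scale=0.1, size=231)

# Pick k by cross-validated RMSE; predictions average the responses of
# the k nearest training observations.
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": list(range(1, 31))},
    scoring="neg_root_mean_squared_error",
    cv=5,
).fit(X, y)
print("cross-validated k:", search.best_params_["n_neighbors"])
```

The grid of 1 to 30 neighbors is an illustrative range, not the one used in the thesis.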

3.3.3 Decision trees (DT)

The decision tree method is a learning technique first introduced by Morgan & Sonquist (1963). To construct a decision tree, a first split is made based on one of the independent variables. To determine which variable is chosen for this split, the model minimizes the heterogeneity of the values of the outcome measure at each node, which is concretized by a splitting rule (e.g., least squares). After this primary split, the resulting nodes are split again, following the same splitting rule. The tree is grown further by iterating the steps of variable selection and splitting, until no additional splits can be made or until a stopping rule is satisfied. Next, the mean of the responses of the observations in a terminal node is assigned to that node. To make a prediction, an observation passes through the tree until a terminal node is reached, which yields the predicted response (Moisen, 2008; Quinlan, 1987). Following this algorithm, a model able to predict the value of the dependent variable is constructed based on multiple variables. The method can handle both continuous and discrete variables, while also being robust to outliers. Additionally, the construction and interpretation are rather easy (Moisen, 2008). The method is believed to be accurate and efficient, but has the drawback of being rather unstable and prone to overfitting (Dudoit, Fridlyand, & Speed, 2002). Moreover, as the splitting procedure only considers the next best split, a better performance could be obtained by applying a method that evaluates a combination of multiple variables at the point of splitting, as suggested by Dreiseitl & Ohno-Machado (2002). We apply the CART method (classification and regression trees), as introduced by Breiman, Friedman, Olshen, & Stone (1984), using binary recursive splitting. The algorithm is called binary because two child nodes are created from each parent node; its recursive nature stems from the fact that each child node becomes a parent node, unless it is a terminal node (Moisen, 2008).

3.3.4 Bagged trees (BT)

Bagging is applied in order to improve the performance of the decision tree method (Dudoit et al., 2002; Moisen, 2008). The word bagging originates from "bootstrap aggregating": the method bootstraps learning sets and then uses these as new learning sets. These bootstrap samples are constructed by randomly sampling the data with replacement, which makes the chance of drawing two identical samples very low. Bagged trees create multiple decision trees, based on the different bootstrap samples. The trees are subsequently aggregated by averaging the output. When the sample size is large enough, the average of the variances of the different samples converges to the true variance (Boos, 2003). The benefits of bagging are largest when using an unstable prediction method that tends to overfit, such as the basic decision tree (Breiman, 1996; Moisen, 2008). We follow the suggestions of Breiman et al. (1984) and use CART decision trees as the base learner.

3.3.5 Random forest (RF)

Random forest is an extension of the bagging method for decision trees (Hastie et al., 2009). In addition to bagging, random forest uses random feature selection, as introduced by Ho (1998). This entails a random selection of m variables out of the total set of p indicators at each node. Next, the best predictor among these m is chosen, after which the next-generation splits are created. This process is repeated until a minimum number of nodes has been created (Hastie et al., 2009; Ho, 1998). Similar to the previous algorithm, the trees are ultimately aggregated by averaging over the forest. The random feature selection decorrelates the different trees that are grown, which results in a lower variance of the predictions compared to bagging (Hastie et al., 2009). As Breiman (2001) concluded that using more trees reduces overfitting of the data, we opted for 500 trees in our analysis.
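A sketch of this configuration, assuming scikit-learn: n_estimators gives the 500 trees, and max_features controls the m variables randomly drawn at each split, with the square root of p used here as an illustrative choice:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data standing in for a basetable of 231 movies.
rng = np.random.default_rng(2)
X = rng.normal(size=(231, 20))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=231)

# 500 decorrelated trees, aggregated by averaging over the forest.
rf = RandomForestRegressor(
    n_estimators=500, max_features="sqrt", random_state=0
).fit(X, y)
```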


3.3.6 Gradient boosted models (GBM)

Boosting is a sequential approach in which a weak learning technique is applied repeatedly to a data set, which is updated after each iteration. In each iteration, the weak predictions are combined into a more powerful prediction (Hastie et al., 2009). Gradient boosting starts with a base model which tries to predict the response variable. During the first iteration, a model is fit on the pseudo-residuals of the base model by applying least squares. These pseudo-residuals are defined as the gradient of the loss function one intends to minimize, taken with respect to the base model's outcome predictions at each of the training data points. Next, the model obtained in the first iteration is added to the base model, after which the next iteration is started. The algorithm stops when a specified number of iterations is completed (Friedman, 2002). The GBM procedure is competitive with other robust and interpretable regression methods (Friedman, 2001) and was proven to have a lower prediction error than bagging methods for regressors (Drucker, 1997).

3.3.7 Neural networks (NN)

The neural network algorithm is inspired by the biological workings of the brain (Lippmann, 1987). This simple and effective method is non-parametric, meaning it makes no assumptions about the underlying distribution of the data (Dreiseitl & Ohno-Machado, 2002; Lippmann, 1987; Venkatesh, Ravi, Prinzie, & Poel, 2014). The network consists of three layers: the input layer, the hidden layer and the output layer. The hidden layer is applied in order to construct non-linear relations between the input variables (Dreiseitl & Ohno-Machado, 2002; Lek & Guégan, 1999; Venkatesh et al., 2014). To start, random weights are assigned to the nodes. During training, these weights are updated in order to improve the model's performance. The adaptation of the weights aims to minimize a specified error function, often the RMSE, which leads to more accurate predictions of future responses (Dreiseitl & Ohno-Machado, 2002; Lek & Guégan, 1999; Venkatesh et al., 2014). An advantage of neural networks is their flexibility. However, this property also makes them more prone to overfitting. This effect can be countered by applying, among others, weight decay, which limits the maximum allowed weights for the input variables (Dreiseitl & Ohno-Machado, 2002). In order to maximize the performance of the algorithm, we optimize both the size of the network and the weight decay.
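Jointly tuning the network size and the weight decay can be sketched with scikit-learn's MLPRegressor (an assumed tool), whose alpha parameter is an L2 penalty that plays the role of the weight decay described above; the grid values are illustrative:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
y = np.tanh(X[:, 0]) + rng.normal(scale=0.1, size=200)

# One hidden layer; cross-validate its size and the decay strength alpha.
search = GridSearchCV(
    MLPRegressor(max_iter=1000, random_state=0),
    param_grid={"hidden_layer_sizes": [(3,), (10,)], "alpha": [1e-4, 1e-1]},
    scoring="neg_root_mean_squared_error",
    cv=3,
).fit(X, y)
```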


3.4 Model evaluation criteria

We used the root mean squared error (RMSE), the mean absolute error (MAE), the mean absolute percentage error (MAPE) and R² to benchmark the different models. The MAE measures the average absolute error between the predicted and the actual values and gives equal weight to all observations (Li & Shi, 2010). The RMSE does not use the absolute value, but instead first calculates the average squared error and then takes the square root of this figure. Large errors are given more weight in the RMSE, which makes this measure preferable when large errors should be penalized (Li & Shi, 2010). The MAPE, on the other hand, calculates the average error between the prediction and the actual value as a percentage (Li & Shi, 2010; Willmott, 1982). Many researchers chose to include the adjusted R² to measure the performance of their predictions of movie box office revenues (Asur & Huberman, 2010; Ding et al., 2017; T. Liu et al., 2016). However, we opted for a non-adjusted R². Montgomery & Morrison (1973) motivate the use of the adjusted R² for measuring the performance of least squares algorithms, since those algorithms find parameters that maximize R² on the sample set. For this reason, adding more variables will lead to overfitting the model, which encourages correcting R² for the number of variables. Two crucial reasons in favor of using the adjusted R² do not hold in our methodology. First, we are not using any least squares algorithms, meaning our algorithms will not try to maximize R². Second, we do not calculate R² on the sample set used to determine the parameters, but on a separate test set. Both of these elements ensure R² is not blindly maximized, which is why we do not adjust this measure. The MAE, RMSE, MAPE and R² are defined as follows:

\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i - \hat{Y}_i \right| \qquad (2)

\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2}{N}} \qquad (3)

\text{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|Y_i - \hat{Y}_i|}{Y_i} \cdot 100 \qquad (4)

R^2 = 1 - \frac{\text{SSE}}{\text{SST}} \quad \text{with} \quad \text{SSE} = \sum_i (Y_i - \hat{Y}_i)^2 \ \text{and} \ \text{SST} = \sum_i (Y_i - \bar{Y})^2 \qquad (5)


Where N is the number of observations (231), Yi is the actual value for box office performance, Ŷi is the predicted value for the revenues and Y̅ is the mean of the observations. The SSE expresses the sum of squared errors and SST calculates the total sum of squares. The formula given for R² (Equation 5) is the general formula. In predictive analytics, this coefficient of determination is often calculated as the square of the Pearson correlation between the predicted values and their actual values (Zou et al., 2003).
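Equations (2) through (5) translate directly into code; a NumPy sketch, with R² computed via the general formula:

```python
import numpy as np

def mae(y, yhat):
    """Equation (2): mean absolute error."""
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    """Equation (3): root mean squared error."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    """Equation (4): mean absolute percentage error."""
    return np.mean(np.abs(y - yhat) / y) * 100

def r2(y, yhat):
    """Equation (5): general (non-adjusted) coefficient of determination."""
    sse = np.sum((y - yhat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 1 - sse / sst
```

As noted above, the squared-Pearson-correlation variant of R² would be a drop-in alternative for a predictive setting.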

3.5 Cross validation

We used five times two-fold cross-validation, as proposed by Dietterich (1998). The method starts by randomly splitting the dataset into two equal folds. Each partition is used twice, once as a training set and once as a test set (Alpaydin, 1999). The whole procedure is repeated five times in total. This results in ten values of each of the performance measures per algorithm per model.
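The five times two-fold scheme translates into ten train/test index pairs; a plain-NumPy sketch (any learner with fit and predict methods can then be evaluated on these splits):

```python
import numpy as np

def five_times_two_fold(n, seed=0):
    """Yield (train_idx, test_idx) ten times: 5 repeats x 2 folds.

    With an odd n the two halves differ by one observation.
    """
    rng = np.random.default_rng(seed)
    for _ in range(5):                      # five random repartitions
        perm = rng.permutation(n)
        half_a, half_b = perm[: n // 2], perm[n // 2:]
        yield half_a, half_b                # train on A, test on B
        yield half_b, half_a                # train on B, test on A

folds = list(five_times_two_fold(231))      # ten folds for 231 movies
```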

To compare the performance of the different algorithms, we used the Friedman test with a Bonferroni-Dunn post-hoc test, a non-parametric alternative to the parametric ANOVA (Conover & Iman, 1981). The Friedman test ranks the algorithms for each of the ten folds based on one of the performance measures (M. Friedman, 1937). When algorithms are tied, they are assigned the average of the ranks they would have received had no ties existed. These ranks are then averaged over the ten folds and subsequently tested for significant differences using a Bonferroni-Dunn post-hoc test (Dunn, 1964). This test allows us to calculate the critical difference, based on a significance level of 0.05. If the difference in average ranks of two algorithms is higher than this critical value, the algorithm with the lowest rank significantly outperforms the algorithm it was compared to.
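The Friedman test itself is available as scipy.stats.friedmanchisquare (an assumed tool); the Bonferroni-Dunn comparison on the average ranks then has to be computed separately. The fold-wise RMSE values below are illustrative:

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical RMSE per fold for three algorithms; one is clearly better,
# so the null hypothesis of equal performance should be rejected.
rng = np.random.default_rng(4)
rmse_rf = rng.normal(1.00, 0.02, size=10)
rmse_lr = rng.normal(1.50, 0.02, size=10)
rmse_nn = rng.normal(1.60, 0.02, size=10)

stat, p = friedmanchisquare(rmse_rf, rmse_lr, rmse_nn)
print(f"Friedman chi2 = {stat:.2f}, p = {p:.4g}")
```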

We use the paired Wilcoxon signed-rank test to compare the performance of the six models per algorithm in a pair-wise comparison (Demšar, 2006). The Wilcoxon test compares the performance measures of two models. For each fold, the performance gap between the two models for a specific algorithm is calculated. Next, the differences are ranked without taking their sign into account. When observations are tied, the mean rank value is assigned to them. Afterwards, the ranks corresponding to negative differences are made negative to take the direction into account. Subsequently, the ranks are summed to obtain a test statistic, which has a mean equal to 0 under the null hypothesis. The null hypothesis is rejected when the absolute value of the test statistic is larger than the predefined critical value, which can be consulted in a reference table for different numbers of observations (Wilcoxon, 1945).


The non-parametric Wilcoxon test does not impose the assumptions of normality and homogeneity of variance that the parametric t-test does. When these assumptions are met, the t-test performs better than the Wilcoxon test. However, making those assumptions across multiple datasets is questionable, which explains our choice for the Wilcoxon test, in accordance with Demšar (2006). Based on the pairwise Wilcoxon tests, wins-ties-losses tables are constructed for each of the performance measures. The number of wins corresponds to the number of algorithms for which one model significantly outperforms the other. If no significant difference exists between two models for a particular algorithm, a tie is added. Ultimately, these tables provide an overview of the performance of each model in comparison with the other models.
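A single cell of a wins-ties-losses table boils down to one paired Wilcoxon signed-rank test on the ten fold-wise values; scipy is assumed and the numbers are illustrative:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(5)
fold_rmse_a = rng.normal(1.0, 0.05, size=10)  # model A, ten folds
fold_rmse_b = fold_rmse_a + 0.3               # model B, consistently worse

# Paired test on the fold-wise differences; model A "wins" if the
# difference is significant and in its favor.
stat, p = wilcoxon(fold_rmse_a, fold_rmse_b)
model_a_wins = bool(p < 0.05 and np.median(fold_rmse_a) < np.median(fold_rmse_b))
```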

3.5.1 Information fusion-based variable importance

To discover which variables have the most predictive power, we conduct an information fusion-based variable importance analysis. Information fusion is a technique which combines multiple prediction models into a single model. This fused model tends to be more accurate and robust than the models it was produced from (Oztekin et al., 2013). If y is the response variable we intend to predict based on indicators x_1, x_2, ..., x_N, using prediction model i represented as f_i, we can represent any model as follows:

\hat{y}_i = f_i(x_1, x_2, \ldots, x_N) \qquad (6)

Where \hat{y}_i represents the dependent variable as predicted by model i. The information fusion model then combines several \hat{y}_i's into a fused prediction, represented by \hat{y}_{\text{fuse}}:

\hat{y}_{\text{fuse}} = \psi(\hat{y}_i) = \psi\big( f_i(x_1, x_2, \ldots, x_N) \big) \qquad (7)

Where the fusion function is denoted as \psi. Similar to Sevim, Oztekin, Bali, Gumus, & Guresen (2014), we use a linear information fusion model; mathematically, the previous equation becomes:

\hat{y}_{\text{fuse}} = \sum_{i=1}^{I} \omega_i \, f_i(x_1, x_2, \ldots, x_N) \qquad (8)


Where \omega_i is the weighting factor of prediction model i. These weights are determined by the accuracy of the model, such that better performing models are assigned higher weights. In our case, we defined \omega_i using the RMSE as follows:

\omega_i = \frac{1 / \text{RMSE}_{i0}}{\sum_{i=1}^{I} 1 / \text{RMSE}_{i0}} \qquad (9)

Where \text{RMSE}_{i0} equals the RMSE of base model i. This term is inverted to reflect that a lower RMSE implies superior performance. For each model, we can determine the importance of each variable j by permuting variable j with random values. Next, we predict y with the model in which variable j is permuted. The extent of the increase in RMSE indicates the importance of the permuted variable j. In mathematical terms:

v_{ij} = \text{RMSE}_{ij} - \text{RMSE}_{i0} \qquad (10)

Where v_{ij} is the variable importance of variable j in model i, \text{RMSE}_{ij} is the RMSE of model i with variable j permuted and \text{RMSE}_{i0} that of the base model. To obtain more robust estimates of the variable importances, using information from multiple models, we combine the weights of the information fusion model with the variable importance measure (Oztekin et al., 2013):

v_{\text{fuse},j} = \sum_{i=1}^{I} \omega_i \, v_{ij} \qquad (11)

This process is repeated for each of the folds. To make the variable importance measure even more robust, variable j is permuted three times within each fold. The resulting v_{fuse,j}'s are averaged to obtain an overall measure of variable importance. This measure represents how much the overall RMSE would increase if variable j were not included in the model. As such, it indicates how influential variable j is in relation to the dependent variable.
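Equations (9) through (11) can be sketched directly; the models are any fitted objects with a predict method, and all names are illustrative:

```python
import numpy as np

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

def fused_importance(models, X, y, j, n_perm=3, seed=0):
    """Fused permutation importance of variable j over the given models."""
    rng = np.random.default_rng(seed)
    # Base RMSE of each model and the inverse-RMSE weights of Equation (9).
    base = np.array([rmse(y, m.predict(X)) for m in models])
    weights = (1.0 / base) / np.sum(1.0 / base)
    v = np.zeros(len(models))
    for _ in range(n_perm):                  # permute j three times per fold
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        # Equation (10): increase in RMSE caused by permuting variable j.
        v += np.array([rmse(y, m.predict(Xp)) for m in models]) - base
    # Equation (11): weighted combination across the models.
    return float(np.sum(weights * (v / n_perm)))
```

A variable the models actually use yields a positive fused importance; permuting an unused variable leaves the RMSE, and hence the importance, at zero.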


3.5.2 Partial dependence plots

After determining the most important variables, partial dependence plots are built to investigate the relationship between these most indicative variables and the dependent variable. These plots visualize the relationship between an independent variable and the dependent variable, while the effect of the other variables is averaged out (Friedman, 2001; Friedman & Meulman, 2003).

To construct the partial dependence plots, we follow the methodology of Berk (2008). Suppose we want to create a partial dependence plot for variable x_1, which has v distinct values. For each of these v values, a dataset is created in which x_1 is held at that constant value, while the other variables remain unchanged across all datasets. Next, the response for each observation in the dataset is predicted with the fusion model and averaged over all observations. This process is repeated for all v values of x_1. Finally, the average predictions of each dataset are plotted against the v distinct values of x_1.
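Berk's (2008) recipe is a short loop; the model here is any fitted predictor, the names are illustrative, and for continuous variables one would typically evaluate a grid rather than all distinct values:

```python
import numpy as np

def partial_dependence(model, X, j):
    """Return (distinct values of variable j, averaged predictions)."""
    values = np.unique(X[:, j])
    averages = []
    for v in values:
        Xv = X.copy()
        Xv[:, j] = v                        # hold variable j constant
        averages.append(model.predict(Xv).mean())  # average over all rows
    return values, np.array(averages)
```

Plotting the averaged predictions against the distinct values then yields the partial dependence plot described above.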

4. Results and discussion

4.1 Benchmarking algorithms

The cross-validated results are summarized in Figure 2 for RMSE, MAE, MAPE and R². Table 5 reports the median RMSE, MAE, MAPE and R² of the five times two-fold cross-validation for each model. For each model, the best performance among the seven algorithms is underlined and stressed in bold. A lower value for the RMSE, the MAE and the MAPE indicates a better performance, while a lower value for R² implies an inferior performance. Figure 2 and Table 5 suggest that random forest and bagged trees are the top performing algorithms. Linear regression and neural networks, on the other hand, consistently perform worse than the other algorithms across all models.


Figure 2: Cross-validated median performance


Table 5: Cross-validated median performance

Table 6: Friedman test with Bonferroni-Dunn post hoc test


In order to evaluate which of the algorithms differed significantly, the Friedman test with Bonferroni-Dunn post-hoc tests was executed. The results for each data set are given in Table 6, which should be interpreted as follows: a difference in performance between two algorithms is significant when the difference between their average ranks, given in the table, is larger than the critical value of 2.548799. The algorithms that are not significantly different from the top performer are underlined and stressed in bold.

The Friedman test shows that there are significant differences in performance between the algorithms, as for each of the models and performance measures the Friedman χ² statistic is rather high and the p-value is lower than 0.001. Random forests and bagged trees are consistently ranked highest for each of the measures. However, the performance of the best performing algorithm is in most cases not significantly higher than that of gradient boosting and k-nearest neighbors. Decision trees are significantly less accurate than the best performing algorithm in all of the models in terms of RMSE and R²; in terms of MAE and MAPE the difference is only significant for three out of six models. Linear regression and neural networks are significantly less accurate for each model.

4.2 Model performance comparison

The results of the comparison between the different models are summarized using wins-ties-losses (Demšar, 2006). Wins-ties-losses are constructed as follows. Each pair of models is compared for each of the seven analytical methods. When one model significantly outperforms the other, the number of wins relative to the other model is increased; otherwise the number of losses is increased. If there is no significant difference, the number of ties rises. To investigate whether the wins and losses were significant, we conducted pair-wise Wilcoxon signed-rank tests for each combination. The significant wins, ties and losses are depicted in Table 7, the absolute wins are added in parentheses and the results for the most important conclusions are underlined.


Table 7: Wins-ties-losses


Table 7 shows that models including Facebook significantly outperform the Twitter models in most cases. The addition of Twitter data to Facebook data entails no significant improvement, as the combined models tie for all seven algorithms for all performance measures with their corresponding model that only includes Facebook data, both for the models excluding user-generated content, as well as for the models containing all data.

Furthermore, adding the user-generated content of Twitter to the model has no effect on performance, as the models tie for each of the seven algorithms for all performance measures. When comparing the two models that solely incorporate Facebook variables, we observe that, in contrast to the Twitter model, the Facebook model including user-generated content performs better for two of the seven algorithms than the model containing only the page popularity indicators and page-generated content, for the MAE, MAPE and R² measures. The same conclusion can be drawn for the models that combine Facebook and Twitter. These results imply that adding the user-generated content data would hold additional predictive power. However, it should be noted that the significant differences in performance between these two models are caused by the two worst performing algorithms, i.e., linear regression and neural networks. This leads us to believe that the two wins should be disregarded. We conclude that, similarly to Twitter, adding the user-generated content from Facebook holds no additional indicative power.

4.3 Variable importances and partial dependence plots

4.3.1 Information fusion-based variable importances

For the models including all Facebook data, all Twitter data and all data from both platforms, a fusion model based on all seven algorithms was built in order to identify the most influential page popularity indicators and WOM metrics and to compare the different variable types against each other. The variable importances were determined by measuring the mean increase in RMSE when the variable in question is omitted. Attachment 3 gives an overview of this measure for all variables of these three models, in order of importance.
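This fusion-based importance can be sketched as follows. The sketch uses permutation of a variable's values as a common stand-in for omitting it, averaged over the fitted models; the model objects and their `.predict` interface are hypothetical, not the thesis's code.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean squared error of paired true values and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fused_importance(models, X, y, var_idx, n_repeats=10, seed=1):
    """Mean increase in RMSE, over an ensemble of fitted models, when one
    variable's column is randomly permuted (a proxy for omitting it).
    `models` are fitted objects exposing a .predict(rows) method."""
    rng = random.Random(seed)
    increases = []
    for model in models:
        base = rmse(y, model.predict(X))
        for _ in range(n_repeats):
            col = [row[var_idx] for row in X]
            rng.shuffle(col)
            X_perm = [row[:var_idx] + [v] + row[var_idx + 1:]
                      for row, v in zip(X, col)]
            increases.append(rmse(y, model.predict(X_perm)) - base)
    return sum(increases) / len(increases)
```

Averaging the increase over all seven fitted algorithms is what makes the importance "fusion-based": a variable only ranks highly if removing its information hurts the ensemble as a whole.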


4.3.1.1 Facebook

Table 8 shows the ten most indicative variables for the model containing all data from Facebook. By a large margin, the page popularity variable of Facebook (the number of likes) is the most influential. The hype factor of the comments, which reflects whether the comments are created by a small or a large group of commenters, is found in second place (Gaikar et al., 2015). The average length of a Facebook post is found to be very indicative, as suggested by Liu et al. (2010). Among the WOM metrics, the volume measures seem to be more indicative than the valence measures, as just three valence measures are among the nine most indicative WOM variables. Next to that, including time-based WOM variables concerning the period after the movie's release in the Facebook model is justified, as three of the nine best WOM measures are related to this time-bounded dimension.

Table 8: Variable importances FbAll: top ten (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr.  Variable name              MeanIncreaseRMSE  PPI/PGC/UGC  Vol/Val  U/T
1    NrOfLikesFb                0.059233572       PPI          -        -
2    HypeFactorCommentsFb       0.011336754       UGC          Vol      U
3    NrTalkingAboutFb           0.003150166       PGC          Vol      U
4    Nr2WeeksAftrRelCommentsFb  0.00310047        UGC          Vol      T
5    AvgLengthPostFb            0.002899706       PGC          Vol      U
6    PctAftrRelCommentsFb       0.002748436       UGC          Vol      T
7    AvgSentScoreCommentsFb     0.002536616       UGC          Val      U
8    AvgSentAftrRelCommentsFb   0.002302529       UGC          Val      T
9    AvgNrOfLikesFb             0.002288758       PGC          Vol      U
10   RatioPosNegCommentsFb      0.002210272       UGC          Val      U

4.3.1.2 Twitter

When looking at the Twitter model (Table 9), one of the page popularity measures (the number of followers) is the most indicative metric, in accordance with the Facebook model. Similarly, volume measures hold more predictive power, as only two valence measures are present among the nine most important WOM variables (Liu et al., 2010). The conclusions of Ding et al. (2017) and Kim et al. (2015), stating that variables covering one week before and two weeks after the release hold a lot of indicative power, are supported, as four of the nine most important WOM measures are related to this critical period (Asur & Huberman, 2010).


Table 9: Variable importances TwAll: top ten (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr.  Variable name              MeanIncreaseRMSE  PPI/PGC/UGC  Vol/Val  U/T
1    NrOfFollowersTw            0.038163855       PPI          -        -
2    AvgNrOfRetweetedRepliesTw  0.005223189       UGC          Vol      U
3    AvgLengthReplyTw           0.004376805       UGC          Vol      U
4    AvgNrOfFavoritedTw         0.004024057       PGC          Vol      U
5    AvgNrOfRepliesTw           0.003775183       UGC          Vol      U
6    Pct1WeekBfrReleaseTw       0.003005681       PGC          Vol      T
7    PctNeutral2WeeksAftrRelTw  0.002580824       PGC          Val      T
8    PctBfrRelRepliesTw         0.00229981        UGC          Vol      T
9    PctPos2WeeksAftrRelTw      0.001906647       PGC          Val      T
10   Pct2WeekAftrRelTw          0.001895665       PGC          Vol      T

4.3.1.3 Combined model

The last model includes all variables from both platforms. Of the ten most important variables, only two are Twitter-related, supporting our conclusion that Twitter data does not add a significant improvement in predictive power to Facebook data. The most important indicator is the number of likes the movie's Facebook page has (Table 10). Surprisingly, the number of neutral comments on Facebook two weeks after the release is a highly ranked variable. After investigating the data, we noticed that in this time period, a significant amount of the comments that were classified as neutral expressed an intention or excitement to watch the movie (e.g., “Can't wait!”). These were not classified as positive by our algorithm, as the method evaluates each word independently and no unambiguously positive word is found in these kinds of intention expressions. However, these intention statements can have a large influence on box office revenues. Previous researchers have mainly investigated the relation between these kinds of intention expressions on Twitter and box office revenue. For example, Rui et al. (2013) found that intention tweets had the largest effect in their study. Additionally, Liu et al. (2016) discovered that purchase intention expressed on social media leads to more accurate box office success predictions. These conclusions explain the occurrence of these neutral comments among the ten most important variables. When comparing different variable types, the page popularity indicator of Facebook is the most indicative variable by a clear margin. The page popularity indicators from Twitter are no longer in the top ten, as the best indicator in this category for Twitter, the number of followers, is only found in the 34th place (Attachment 3). This is consistent with the findings of Oh et al. (2016), who stated that adding Facebook likes made the number of Twitter followers insignificant in predicting box office revenue. Concerning the importance of volume and valence, we state that volume is slightly more important, as more volume variables are found among the best-performing variables and as most of the included valence variables also hold a volume component. Again, the usefulness of time-based measures is confirmed, as three variables in the top five include the time dimension. Both volume and valence variables benefit from this extra dimension.

Table 10: Variable importances FbTwAll: top ten (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr.  Variable name                     MeanIncreaseRMSE  PPI/PGC/UGC  Vol/Val  U/T
1    NrOfLikesFb                       0.062738797       PPI          -        -
2    NrNegBfrRelCommentsFb             0.003463008       UGC          Val      T
3    Nr2WeeksAftrRelCommentsFb         0.00339905        UGC          Vol      T
4    HypeFactorCommentsFb              0.003384417       UGC          Vol      U
5    NrNeutral2WeeksAftrRelCommentsFb  0.00291628        UGC          Val      T
6    AvgSentScoreCommentsFb            0.002798468       UGC          Val      U
7    AvgNrOfFavoritedTw                0.002564835       PGC          Vol      U
8    AvgNrOfRetweetedRepliesTw         0.001939541       UGC          Vol      U
9    AvgLengthPostFb                   0.001519693       PGC          Vol      U
10   NrPosAftrRelCommentsFb            0.001493753       UGC          Val      U

4.3.2 Partial dependence plots

Partial dependence plots were constructed for the most important measures of the combined model (Attachment 4). Below, the most prevalent measures and trends of the combined model are depicted and discussed (Figure 3).
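A partial dependence value for one variable is simply the model's average prediction when that variable is clamped to a grid value for every observation (Friedman, 2001). A minimal sketch, assuming a fitted model exposing a hypothetical `.predict` method:

```python
def partial_dependence(model, X, var_idx, grid):
    """For each grid value v, set variable var_idx to v for every
    observation in X and average the model's predictions."""
    pd_values = []
    for v in grid:
        X_mod = [row[:var_idx] + [v] + row[var_idx + 1:] for row in X]
        preds = model.predict(X_mod)
        pd_values.append(sum(preds) / len(preds))
    return pd_values
```

Plotting `grid` against the returned averages yields curves such as those in Figure 3.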

Figure 3a shows the partial dependence plot of the number of likes. As expected, the more likes a movie's Facebook page has, the more success the movie will have (Ding et al., 2017; Oh et al., 2016). Figure 3b shows a positive relation between the number of negative comments before the movie's release and its success, which seems counterintuitive at first. A similar relation was observed by Yu et al. (2013), who found a positive relationship between the number of negative tweets and stock returns. Shao (2012) investigated the relation between negative reviews and box office revenues and found that controversial reviews spark a conversation, which leads to product sales. This indicates that controversy, which entails a lot of negative messages, can have a positive effect on box office revenues, explaining the positive relation for the number of negative comments.


Figure 3: Partial dependence plots

Figure 3c depicts the relation between a movie's performance and the hype factor. A positive relationship is observed, indicating that having more different people leaving comments improves box office revenue. Thus, having many people visit a page and leave a comment just once is more beneficial than having a very engaged smaller fan base (Gaikar et al., 2015). The plot in Figure 3d shows that the number of neutral comments two weeks after the movie's release is positively correlated with box office success. As explained earlier, many of the comments classified as neutral express an intention to go watch the movie. The findings of Liu et al. (2016) and Rui et al. (2013), who investigated these intention remarks on Twitter and showed their positive effect on a movie's revenues, are thus confirmed and extended to Facebook. Next, a clear positive relation exists between the average sentiment score of the comments and box office revenue (Figure 3e). This validates our hypothesis of a positive relation between the valence of the WOM and box office success. Moreover, the findings of many researchers who stated that positive WOM is an important catalyst for box office performance are confirmed (Chintagunta et al., 2010; Dellarocas et al., 2007).


Lastly, Figure 3f shows that longer comments have a negative effect on box office revenue. This contradicts the findings of Liu et al. (2010), who stated that the length of messages on Yahoo!Movies is positively correlated with the success of a new product. Our findings are, however, replicated in the equivalent measures for Twitter's tweets. It should be noted that the length of messages on Facebook, and certainly on Twitter, is more restricted than on Yahoo!Movies, which was used in the study of Liu et al. (2010). Moreover, the conclusion of Lucas (1934) could be extended to social media: he found that shorter headlines for advertisements improved the chance of people recalling them, as longer headlines were skipped by hasty readers, explaining the negative relationship between longer posts and box office success.

5. Conclusion and practical implications

We contributed to research in several ways. We are the first to investigate the importance and relations of a comprehensive set of Facebook variables in predicting box office revenues. Similarly, we constructed both a Facebook and a Twitter model including all variables used in previous research. We identified the most influential variables and investigated their relationships. Next, we are the first to introduce an extensive comparison of Facebook and Twitter data in their relation to movie sales. Our results are far more robust than those of previous studies (Asur & Huberman, 2010; Oh et al., 2016; Rui et al., 2013), as we benchmarked several models over multiple algorithms and used significantly more movies to conduct our analysis. Finally, we evaluated seven commonly used prediction methods.

Our analysis showed that linear regression and neural networks are rather poor methods for predicting movie revenues with social media measures. Random forest and bagged trees are consistently observed to be the two best methods. K-nearest neighbours and gradient boosting perform worse than random forest and bagged trees, but these differences are rarely significant. The last method, decision trees, yields better results than linear regression and neural networks, but does not perform as well as random forest, bagged trees, k-nearest neighbours and gradient boosting. Therefore, we recommend that practitioners certainly include random forests and bagged trees when predicting box office revenues, while excluding linear regression and neural networks.

Secondly, we found that Facebook data significantly outperformed Twitter data at predicting box office revenues. Moreover, adding Twitter data to a model containing Facebook data does not significantly improve the performance. This conclusion generalizes the findings of Oh et al. (2016), who found that

the number of followers on Twitter had a significant influence on box office revenue until the number of Facebook likes was added to the model. However, we implemented a more comprehensive comparison by including all variables used in the literature and by using seven algorithms. This leads to more well-grounded conclusions about each platform's predictive value. The results validate our hypothesis, as it is confirmed that Facebook holds more predictive power on movie sales than Twitter. Additionally, the results showed that incorporating user-generated content into the dataset has no significant effect on a model's performance, which does not justify the computational overhead.

Thirdly, the most important variable in our analysis was the page popularity indicator on Facebook, being the number of likes on the page. Among the ten most important measures, eight are found to be Facebook measures, which indicates again that Facebook is a better platform to base predictions on. Additionally, we showed that volume measures are slightly more indicative than valence variables. However, adding valence measures to a model still proves to be useful, especially those containing a volume-component. Lastly, both time-based and unbounded measures are found to be relevant, both for valence and volume variables.

Finally, the partial dependence plots showed mostly the expected relations. However, some interesting relationships came forward. The most counterintuitive observation is that a measure containing information about the number of negative instances has a positive relation with performance. This could be caused by the fact that negative publicity and controversy can motivate people to watch the movie (Shao, 2012). The length of the posts is found to be negatively correlated with box office revenues, which contradicts the findings of Liu et al. (2010). We noted, however, that this could be related to the fact that messages on Facebook, and certainly on Twitter, are generally shorter than those on Yahoo!Movies. Additionally, longer posts increase the chance of people simply skipping the post (Lucas, 1934).

Companies aiming to predict box office revenue while facing limited resources can benefit from our insights in several ways. First and foremost, practitioners should only extract page-generated Facebook content. Extracting user-generated content and Twitter data requires a lot of additional computational effort, while not significantly improving the predictions. Furthermore, we present the most indicative variables to practitioners, allowing them to prioritize the most prevalent measures. Lastly, we encourage


the use of random forests and bagged trees in this field. Incorporating these recommendations yields a significant reduction in efforts for practitioners when predicting future performance.

Our research also provides a number of relevant insights for marketing departments. By focussing on the most important factors that influence sales, they can boost revenues. We recommend focussing marketing efforts on generating a hype on Facebook rather than on Twitter. Moreover, the main focus should not just be on generating Facebook traffic in general, but on boosting the number of likes the page has, as this was found to be the most influential variable. Furthermore, generating a larger quantity of posts and comments is more important than focussing only on having positive comments. Next to that, posts should be kept concise, as longer posts have a negative impact.

6. Limitations and further research

The first limitation of our research lies in the data collection phase. Only information on the movie's official Facebook and Twitter pages was extracted, while other content is created elsewhere on the social media sites. For example, people create tweets or posts themselves with no links to the official movie page. Some researchers, such as Rui et al. (2013), have incorporated this data based on hashtags. Gathering the corresponding Facebook data might be challenging, since hashtags are not commonly used on Facebook. However, we believe that for the purpose of comparing both platforms' predictive performance our data is sufficient. From a practical perspective, too, companies will always start by extracting their own Facebook or Twitter page and the comments and replies to those pages. Only in a later stage, when performance is not sufficient, will they start extracting the related hashtags. However, we have shown that page-related data already achieves reasonable predictive performance. An additional limitation originating from the data collection phase is the inability of our algorithm to include multimedia used in page-generated items, such as photos and videos. This multimedia data can hold significant predictive power, as Jin et al. (2010) showed by estimating quarterly iPod sales based on the image-sharing site Flickr. Subsequently, an opportunity might exist in applying multimedia learning algorithms to enhance movie box office forecasts.

The second limitation of our research concerns our method of sentiment analysis. Dadvar, Hauff, & de Jong (2011) found that many negation-related sentences are wrongly classified by more traditional methods, such as the naïve Bayesian method we employed. Moreover, we noticed that many intention

text-items were classified as being neutral; a more sophisticated sentiment analysis could seek to identify those intention text-items. Additionally, Hogenboom et al. (2013) improved their classification accuracy by including emoticons in their study. We chose to use a naïve Bayesian method, as this method is easily implemented, still gives a good performance and is used in many studies on movie sales prediction (Kalampokis et al., 2013).
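The failure mode is easy to reproduce with any classifier that scores words independently. The toy word-level scorer below (a hypothetical mini-lexicon, not the naïve Bayesian classifier we used) labels an intention statement such as “Can't wait!” as neutral, because no individual word carries positive sentiment:

```python
# Toy word-by-word sentiment scorer with a hypothetical mini-lexicon,
# illustrating why intention statements end up labelled neutral.
LEXICON = {"great": 1, "love": 1, "bad": -1, "boring": -1}

def classify(text):
    # Sum per-word scores; words outside the lexicon contribute nothing.
    score = sum(LEXICON.get(w.strip("!.,?'").lower(), 0)
                for w in text.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"
```

A phrase- or sequence-aware model would be needed to pick up such intention expressions, in line with the improvements discussed above.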

Lastly, we only included social media measures as this was the focal point of our study. The predictive power of our models could have been increased by adding other sources of data, such as the number of screens and the star power of the actors (Ding et al. 2017; Oh et al. 2016). As we were only interested in comparing the predictive power of Twitter and Facebook, we opted to only include variables related to these social media platforms.


References Alpaydin, E. (1999). Combined 5 × 2 cv F Test for Comparing Supervised Classification Learning Algorithms. Neural Computation, 11(8), 1885–1892. Altman, N. (1992). An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. The American Statistician, 46(3), 175–185. Apala, K. R., Jose, M., Motnam, S., Chan, C.-C., Liszka, K. J., & de Gregorio, F. (2013). Prediction of Movies Box Office Performance Using Social Media. In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 1209–1214). New York, NY, USA: ACM. Arias, M., Arratia, A., & Xuriguera, R. (2014). Forecasting with Twitter Data. ACM Trans. Intell. Syst. Technol., 5(1), 8:1–8:24. Asur, S., & Huberman, B. A. (2010). Predicting the Future with Social Media. In 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) (Vol. 1, pp. 492–499). Berk, R. A. (2008). Statistical learning from a regression perspective. Boos, D. D. (2003). Introduction to the Bootstrap World. Statistical Science, 18(2), 168–174. Box Office Mojo. (2016). Retrieved April 17, 2017, from http://www.boxofficemojo.com/ Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification And Regression Trees. USA: Wadsworth & Brookes/Cole Advanced Books & Software. Chintagunta, P. K., Gopinath, S., & Venkataraman, S. (2010). The Effects of Online User Reviews on Movie Box-Office Performance: Accounting for Sequential Rollout and Aggregation Across Local Markets. SSRN Electronic Journal. Retrieved from http://www.ssrn.com/abstract=1331124 Conover, W. J., & Iman, R. L. (1981). Rank Transformations as a Bridge between Parametric and Nonparametric Statistics. The American Statistician, 35(3), 124–129. Dadvar, M., Hauff, C., & de Jong, F. M. G. (2011, February 4). 
Scope of negation detection in sentiment analysis [Conference or Workshop Paper]. Retrieved May 16, 2017, from http://eprints.eemcs.utwente.nl/19631/ Dellarocas, C., Zhang, X. (Michael), & Awad, N. F. (2007). Exploring the value of online product reviews in forecasting sales: The case of motion pictures. Journal of Interactive Marketing, 21(4), 23–45.


Demšar, J. (2006). Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7(Jan), 1–30. Dhar, V., & Chang, E. A. (2009). Does Chatter Matter? The Impact of User-Generated Content on Music Sales. Journal of Interactive Marketing, 23(4), 300–307. Dietterich, T. G. (1998). Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 10(7), 1895–1923. Ding, C., Cheng, H. K., Duan, Y., & Jin, Y. (2017). The power of the “like” button: The impact of social media on box office. Decision Support Systems, 94, 77–84. Dreiseitl, S., & Ohno-Machado, L. (2002). Logistic regression and artificial neural network classification models: a methodology review. Journal of Biomedical Informatics, 35(5–6), 352–359. Drucker, H. (1997). Improving Regressors using Boosting Techniques (pp. 107–115). Presented at the Fourteenth International Conference on Machine Learning. Duan, W., Gu, B., & Whinston, A. B. (2008a). Do online reviews matter? An empirical investigation of panel data. Decision Support Systems, 45(4), 1007–1016. Duan, W., Gu, B., & Whinston, A. B. (2008b). The dynamics of online word-of-mouth and product sales: An empirical investigation of the movie industry. Journal of Retailing, 84(2), 233–242. Dudoit, S., Fridlyand, J., & Speed, T. P. (2002). Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. Journal of the American Statistical Association, 97(457), 77–87. Dunn, O. J. (1964). Multiple Comparisons Using Rank Sums. Technometrics, 6(3), 241–252. El Assady, M., Hafner, D., Hund, M., Jäger, A., Jentner, W., Rohrdantz, C., … Keim, D. (2013). Visual Analytics for the Prediction of Movie Rating and Box Office Performance. Presented at the VIS. Retrieved from https://kops.uni-konstanz.de/handle/123456789/26533 Facebook. (2016). Facebook for Developers. Retrieved August 1, 2016, from https://developers.facebook.com/ Facebook. (2017).
Company Info | Facebook Newsroom. Retrieved February 23, 2017, from http://newsroom.fb.com/company-info/ Flanagan, E. (2015). Lexicon-Based Sentiment Analysis. Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189–1232. Friedman, J. H. (2002). Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4), 367– 378.


Friedman, J. H., & Meulman, J. J. (2003). Multiple additive regression trees with application in epidemiology. Statistics in Medicine, 22(9), 1365–1381. Friedman, M. (1937). The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association, 32(200), 675–701. Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian Network Classifiers. Machine Learning, 29(2– 3), 131–163. Gaikar, D. D., Marakarkandy, B., & Dasgupta, C. (2015). Using Twitter data to predict the performance of Bollywood movies. Industrial Management & Data Systems, 115(9), 1604–1621. Guàrdia-Sebaoun, É., Rafrafi, A., Guigue, V., & Gallinari, P. (2013). Cross-media Sentiment Classification and Application to Box-office Forecasting. In Proceedings of the 10th Conference on Open Research Areas in Information Retrieval (pp. 201–208). , France, France: LE CENTRE DE HAUTES ETUDES INTERNATIONALES D’INFORMATIQUE DOCUMENTAIRE. Retrieved from http://dl.acm.org/citation.cfm?id=2491748.2491789 Hastie, T., Tibshirani, R., & Friedman, J. H. (2009). The Elements Of Statistical Learning (2nd ed.). Springer. Hennig-Thurau, T., Wiertz, C., & Feldhaus, F. (2015). Does Twitter matter? The impact of microblogging word of mouth on consumers’ adoption of new movies. Journal of the Academy of Marketing Science, 43(3), 375–394. Ho, T. K. (1998). The Random Subspace Method for Constructing Decision Forests. IEEE Trans. Pattern Anal. Mach. Intell., 20(8), 832–844. Hogenboom, A., Bal, D., Frasincar, F., Bal, M., de Jong, F., & Kaymak, U. (2013). Exploiting Emoticons in Sentiment Analysis. In Proceedings of the 28th Annual ACM Symposium on Applied Computing (pp. 703–710). New York, NY, USA: ACM. Huang, J., Boh, W. F., & Goh, K. H. (2011). From A Social Influence Perspective: The Impact Of Social Media On Movie Sales. PACIS 2011 Proceedings. Retrieved from http://aisel.aisnet.org/pacis2011/79 Jain, V. (2013). 
Prediction of Movie Success using Sentiment Analysis of Tweets. The International Journal of Soft Computing and Software Engineering [JSCSE], 308–313. Jin, X., Gallagher, A., Cao, L., Luo, J., & Han, J. (2010). The Wisdom of Social Multimedia: Using Flickr for Prediction and Forecast. In Proceedings of the 18th ACM International Conference on Multimedia (pp. 1235–1244). New York, NY, USA: ACM. Joshi, M., Das, D., Gimpel, K., & Smith, N. A. (2010). Movie Reviews and Revenues: An Experiment in Text Regression. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 293–296). Stroudsburg,


PA, USA: Association for Computational Linguistics. Retrieved from http://dl.acm.org/citation.cfm?id=1857999.1858037 Kalampokis Evangelos, Tambouris Efthimios, & Tarabanis Konstantinos. (2013). Understanding the predictive power of social media. Internet Research, 23(5), 544–559. Kim, T., Hong, J., & Kang, P. (2015). Box office forecasting using machine learning algorithms based on SNS data. International Journal of Forecasting, 31(2), 364–390. Lek, S., & Guégan, J. F. (1999). Artificial neural networks as a tool in ecological modelling, an introduction. Ecological Modelling, 120. Retrieved from https://www.mysciencework.com/publication/show/f4ba05ddfdce5cf9424ba30bd343a1fe Li, G., & Shi, J. (2010). On comparing three artificial neural networks for wind speed forecasting. Applied Energy, 87(7), 2313–2320. Lippmann, R. (1987). An introduction to computing with neural nets. IEEE ASSP Magazine, 4(2), 4–22. Liu, T., Ding, X., Chen, Y., Chen, H., & Guo, M. (2016). Predicting movie Box-office revenues by exploiting large-scale social media content. Multimedia Tools and Applications, 75(3), 1509–1528. Liu, Y. (2006). Word of Mouth for Movies: Its Dynamics and Impact on Box Office Revenue. Journal of Marketing, 70(3), 74–89. Liu, Y., Chen, Y., Lusch, R., Chen, H., Zimbra, D., & Zeng, S. (2010). User-Generated Content on Social Media: Predicting Market Success with Online Word-of-Mouth (SSRN Scholarly Paper No. ID 2655800). Rochester, NY: Social Science Research Network. Retrieved from http://papers.ssrn.com/abstract=2655800 Lo, J.-P. (2010, August 31). The effectiveness of WOM by using Facebook as an implementation in movie industry (Thesis). Retrieved from http://scholarworks.calstate.edu/handle/10211.9/599 Lucas, D. B. (1934). The optimum length of advertising. Journal of Applied Psychology, 18(5), 665–674. McCallum, A., & Nigam, K. (1998). A comparison of event models for Naive Bayes text classification. Moisen, G. G. (2008). Classification and Regression Trees. 
MyScienceWork. Retrieved from https://www.mysciencework.com/publication/show/f970550356e7d4cf0dc4610878086eed Montgomery, D. B., & Morrison, D. G. (1973). A Note on Adjusting R2. The Journal of Finance, 28(4), 1009– 1013. Morgan, J. N., & Sonquist, J. A. (1963). Problems in the Analysis of Survey Data, and a Proposal. Journal of the American Statistical Association, 58(302), 415–434.


Oh, C., Roumani, Y., Nwankpa, J. K., & Hu, H.-F. (2016). Beyond likes and tweets: Consumer engagement behavior and movie box office in social media. Information & Management. Retrieved from http://www.sciencedirect.com/science/article/pii/S0378720616300271 Onishi, H., & Manchanda, P. (2012). Marketing activity, blogging and sales. International Journal of Research in Marketing, 29(3), 221–234. Oztekin, A., Delen, D., Turkyilmaz, A., & Zaim, S. (2013). A Machine Learning-based Usability Evaluation Method for eLearning Systems. Decis. Support Syst., 56(C), 63–73. Pan, R. K., & Sinha, S. (2010). The statistical laws of popularity: Universal properties of the box office dynamics of motion pictures. ArXiv E-Prints, 1010. Quinlan, J. R. (1987). Simplifying decision trees. International Journal of Man-Machine Studies, 27(3), 221– 234. Roy, P. (2015, August 19). Best way to learn kNN Algorithm in R Programming. Retrieved May 4, 2017, from https://www.analyticsvidhya.com/blog/2015/08/learning-concept-knn-algorithms- programming/ Rui, H., Liu, Y., & Whinston, A. (2013). Whose and what chatter matters? The effect of tweets on movie sales. Decision Support Systems, 55(4), 863–870. Sevim, C., Oztekin, A., Bali, O., Gumus, S., & Guresen, E. (2014). Developing an early warning system to predict currency crises. European Journal of Operational Research, 237(3), 1095–1104. Shao, K. (2012). The Effects of Controversial Reviews on Product Sales Performance: The Mediating Role of the Volume of Word of Mouth. International Journal of Marketing Studies, 4(4), 32. SivaSantoshReddy, A., Kasat, P., & Jain, A. (2012). Box-Office Opening Prediction of Movies based on Hype Analysis through Data Mining. International Journal of Computer Applications, 56, 1–5. Tibshirani, R. (1995). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society., (58), 267–288. Twitter. (2016). Twitter Developers. 
Retrieved August 1, 2016, from https://dev.twitter.com/ Venkatesh, K., Ravi, V., Prinzie, A., & Poel, D. V. den. (2014). Cash demand forecasting in ATMs by clustering and neural networks. European Journal of Operational Research, 232(2), 383–392. Wilcoxon, F. (1945). Individual Comparisons by Ranking Methods. Biometrics Bulletin, 1(6), 80–83. Willmott, C. J. (1982). Some Comments on the Evaluation of Model Performance. Bulletin of the American Meteorological Society, 63(11), 1309–1313.


Wong, F. M. F., Sen, S., & Chiang, M. (2012). Why Watching Movie Tweets Won't Tell the Whole Story? In Proceedings of the 2012 ACM Workshop on Workshop on Online Social Networks (pp. 61–66). New York, NY, USA: ACM. Yu, Y., Duan, W., & Cao, Q. (2013). The impact of social and conventional media on firm equity value: A sentiment analysis approach. Decision Support Systems, 55(4), 919–926. Zou, K. H., Tuncali, K., & Silverman, S. G. (2003). Correlation and simple linear regression. Radiology, 227(3), 617–622.


Attachments

Attachment 1 Overview movies

Table 11: Overview movies included in analysis

Movies included in the analysis (alphabetical):

12 O'Clock Boys; 2 Guns; 23 Blast; 3 Days to Kill; 45 Years; 47 Ronin; 5 to 7; 50 to 1; 99 Homes;
A Most Violent Year; A Most Wanted Man; A Royal Night Out; A Walk Among the Tombstones; A Walk in the Woods; About Time; Addicted; After Earth; After Tiller;
Alone Yet Not Alone; American Hustle; American Ultra; Anomalisa; Ant-Man; Battle of the Year; Battleship; Beasts of the Southern Wild; Beautiful Creatures;
Before I Go to Sleep; Beyond the Lights; Big Men; Black or White; Blackhat; Blended; Blood Ties; Blue Caprice; Boyhood; Brave; Brick Mansions; Bridge of Spies;
By the Sea; Calvary; Captive; Carrie; Chappie; Chernobyl Diaries; Christian Mingle; Computer Chess; Concussion; Contraband; Cooties; Cop Car; Copperhead; Creed;
Crimson Peak; Daddy's Home; Dark Skies; Dead Man Down; Delivery Man; Difret; Django Unchained; Do You Believe; Dom Hemingway; Don Jon; Dope; Draft Day;
Earth to Echo; Edge of Tomorrow; Elysium; Escape from Tomorrow; Escape Plan; Everest; Ex Machina; Faith of Our Fathers; Fantastic Four; Finding Vivian Maier;
Free Birds; Fury; Get on Up; Getaway; Gimme Shelter; God Help the Girl; Gone Girl; Guardians of the Galaxy; Here Comes the Boom; Hit and Run; Homefront;
Hot Pursuit; Hotel Transylvania; I Smile Back; Identity Thief; In Secret; Inside Out; Interstellar; Into the Woods; Irrational Man; It Follows;
Jem and the Holograms; Jobs; John Carter; John Wick; Joy; Kids for Cash; Knight of Cups; Krampus; Laggies; Last Vegas; Learning to Drive; Let's Be Cops;
Life After Beth; Lincoln; Little Boy; Looper; Love and Mercy; Love the Coopers; Lucy; Maggie; Maleficent; Men, Women and Children; Mistaken for Strangers;
Moms' Night Out; Mr. Holmes; Mr. Turner; My All-American; Nightcrawler; No Escape; Non-Stop; November Man; Oblivion; Obvious Child; Ouija; Out of the Furnace;
Pan; Pandora's Promise; Paper Towns; ParaNorman; Parker; Particle Fever; People Like Us; Pixels; Playing for Keeps; Point Break; Poltergeist; Project Almanac;
Red Dawn; Ricki and the Flash; RoboCop; Rock of Ages; Rock the Kasbah; Room; Rosewater; Run All Night; Runner Runner; Safe Haven; Selfless; Selma; Seventh Son;
Sicario; Side Effects; Snake and Mongoose; Snowpiercer; Son of God; Southpaw; Spotlight; Spring Breakers; Spy; Steve Jobs; Strange Magic; The 33;
The Age of Adaline; The Babadook; The Best of Me; The Big Short; The Book Thief; The Boxtrolls; The Bronze; The Collection; The Dark Horse; The Duff;
The End of the Tour; The Equalizer; The Fifth Estate; The Gallows; The Gift; The Good Dinosaur; The Great Gatsby; The Green Inferno; The Guilt Trip; The Gunman;
The Host; The Hundred-Foot Journey; The Intern; The Internship; The Lazarus Effect; The Letters; The Lone Ranger; The Longest Ride; The Lorax;
The Man from U.N.C.L.E.; The Man in 3B; The Martian; The Nut Job; The One I Love; The Other Woman; The Peanuts Movie; The Punk Singer; The Revenant; The Rover;
The Secret Life of Walter Mitty; The Short Game; The Skeleton Twins; The Song; The Vow; The Walk; The Wedding Ringer; The Witch; The Wolverine;
Think Like a Man Too; Top Five; Unbroken; Under the Skin; Unfinished Business; Upstream Color; Victor Frankenstein; Warm Bodies; When the Game Stands Tall;
Where Hope Grows; While We're Young; White House Down; Wish I Was Here; Wreck-It Ralph; Zero Dark Thirty



Attachment 2: Overview and explanation of variables

Table 12: Overview and explanation of the FbPost variables (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | Variable explanation | PPI/PGC/UGC | Vol/Val | U/T
1 | NrOfLikesFb | Number of likes on the Facebook page | PPI | - | -
2 | NrOfPostsFb | Number of posts on the Facebook page | PGC | Vol | U
3 | NrTalkingAboutFb | Number of people talking about the Facebook page | PGC | Vol | U
4 | AvgNrOfLikesFb | Average number of likes on a post | PGC | Vol | U
5 | AvgLengthPostFb | Average number of characters of a post | PGC | Vol | U
6 | AvgNrOfDaysBetweenPostsFb | Average number of days between 2 posts | PGC | Vol | T
7 | NrBeforeReleaseFb | Number of posts posted before the release | PGC | Vol | T
8 | NrAFterReleaseFb | Number of posts posted after the release | PGC | Vol | T
9 | PctBeforeReleaseFb | Percentage of the posts posted before the release | PGC | Vol | T
10 | PctAfterReleaseFb | Percentage of the posts posted after the release | PGC | Vol | T
11 | Nr1WeekBfrRelFb | Number of posts posted 1 week before the release | PGC | Vol | T
12 | Pct1WeekBfrReleaseFb | Percentage of posts posted 1 week before the release | PGC | Vol | T
13 | Nr2WeekAftrRelFb | Number of posts posted 2 weeks after the release | PGC | Vol | T
14 | Pct2WeeksAftrReleaseFb | Percentage of posts posted 2 weeks after the release | PGC | Vol | T
15 | NrPosFb | Number of positive posts | PGC | Val | U
16 | NrNegFb | Number of negative posts | PGC | Val | U
17 | NrNeutralFb | Number of neutral posts | PGC | Val | U
18 | RatioPosNegFb | Ratio of positive versus negative posts | PGC | Val | U
19 | PctPosFb | Percentage of posts being positive | PGC | Val | U
20 | PctNegFb | Percentage of posts being negative | PGC | Val | U
21 | PctNeutralFb | Percentage of posts being neutral | PGC | Val | U
22 | NrPosBfrRelFb | Number of positive posts before the release | PGC | Val | T
23 | NrNegBfrRelFb | Number of negative posts before the release | PGC | Val | T
24 | NrNeutralBfrRelFb | Number of neutral posts before the release | PGC | Val | T
25 | RatioPosNegBfrRelFb | Ratio of positive versus negative posts before the release | PGC | Val | T
26 | PctPosBfrRelFb | Percentage of posts before the release being positive | PGC | Val | T
27 | PctNegBfrRelFb | Percentage of posts before the release being negative | PGC | Val | T
28 | PctNeutralBfrRelFb | Percentage of posts before the release being neutral | PGC | Val | T
29 | NrPosAftrRelFb | Number of positive posts after the release | PGC | Val | T
30 | NrNegAftrRelFb | Number of negative posts after the release | PGC | Val | T
31 | NrNeutralAftrRelFb | Number of neutral posts after the release | PGC | Val | T
32 | RatioPosNegAftrRelFb | Ratio of positive versus negative posts after the release | PGC | Val | T
33 | PctPosAftrRelFb | Percentage of posts after the release being positive | PGC | Val | T
34 | PctNegAftrRelFb | Percentage of posts after the release being negative | PGC | Val | T
35 | PctNeutralAftrRelFb | Percentage of posts after the release being neutral | PGC | Val | T
36 | NrPos1WeekBfrRelFb | Number of positive posts 1 week before the release | PGC | Val | T
37 | NrNeg1WeekBfrRelFb | Number of negative posts 1 week before the release | PGC | Val | T
38 | NrNeutral1WeekBfrRelFb | Number of neutral posts 1 week before the release | PGC | Val | T
39 | NrPos2WeeksAftrRelFb | Number of positive posts 2 weeks after the release | PGC | Val | T
40 | NrNeg2WeeksAftrRelFb | Number of negative posts 2 weeks after the release | PGC | Val | T
41 | NrNeutral2WeeksAftrRelFb | Number of neutral posts 2 weeks after the release | PGC | Val | T
42 | RatioPosNeg2WeeksAftrRelFb | Ratio of positive versus negative posts 2 weeks after the release | PGC | Val | T
43 | PctPos2WeeksAftrRelFb | Percentage of posts 2 weeks after the release being positive | PGC | Val | T
44 | PctNeg2WeeksAftrRelFb | Percentage of posts 2 weeks after the release being negative | PGC | Val | T
45 | PctNeutral2WeeksAftrRelFb | Percentage of posts 2 weeks after the release being neutral | PGC | Val | T
46 | AvgSentScoreFb | Average sentiment score of the posts | PGC | Val | U
47 | AvgSentBfrRelFb | Average sentiment score of posts posted before the release | PGC | Val | T
48 | AvgSentAftrRelFb | Average sentiment score of posts posted after the release | PGC | Val | T
49 | AvgSent2WeeksAftrRelFb | Average sentiment score of posts posted 2 weeks after the release | PGC | Val | T
50 | ChangeSentPrePostRelFb | Change in sentiment score of the posts before and after the release | PGC | Val | T
51 | PctChangeSentPrePostRelFb | Percentage change in sentiment score of the posts before and after the release | PGC | Val | T
52 | AvgNrOfPostsPerDayPostsFb | Average number of posts per day | PGC | Vol | T
53 | AvgNrOfPostsPerDayBfrRelPostsFb | Average number of posts per day before the release | PGC | Vol | T
54 | AvgNrOfPostsPerDayAftrRelPostsFb | Average number of posts per day after the release | PGC | Vol | T
55 | AvgNrOfPostsPerDay1WeekBfrRelPostsFb | Average number of posts per day 1 week before the release | PGC | Vol | T
56 | AvgNrOfPostsPerDay2WeeksAftrRelPostsFb | Average number of posts per day 2 weeks after the release | PGC | Vol | T
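The volume and valence features in Table 12 are simple aggregates over a film's Facebook posts. As an illustration, a minimal sketch of how a few of them could be computed, assuming each post has been reduced to a (date, sentiment score) pair; the records and values below are made up for the example and are not the thesis data:

```python
from datetime import date

# Hypothetical post records: (posting date, sentiment score in [-1, 1]).
posts = [
    (date(2015, 3, 1), 0.6),    # positive
    (date(2015, 3, 10), -0.2),  # negative
    (date(2015, 3, 20), 0.0),   # neutral
    (date(2015, 3, 25), 0.8),   # positive
]
release = date(2015, 3, 15)

n_pos = sum(1 for _, s in posts if s > 0)                  # NrPosFb
n_neg = sum(1 for _, s in posts if s < 0)                  # NrNegFb
pct_pos = n_pos / len(posts)                               # PctPosFb
ratio_pos_neg = n_pos / n_neg if n_neg else float("inf")   # RatioPosNegFb
avg_sent = sum(s for _, s in posts) / len(posts)           # AvgSentScoreFb
n_before = sum(1 for d, _ in posts if d < release)         # NrBeforeReleaseFb

print(n_pos, n_neg, pct_pos, ratio_pos_neg, n_before)
```

The time-based variants (e.g. Nr1WeekBfrRelFb) follow the same pattern with a narrower date window.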


Table 13: Overview and explanation of the additional FbAll variables (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | Variable explanation | PPI/PGC/UGC | Vol/Val | U/T
57 | NrofCommentsFb | Number of comments | UGC | Vol | U
58 | AvgNrLikesCommentFb | Average number of likes on the comments | UGC | Vol | U
59 | AvgNrOfCommentsFb | Average number of comments per post | UGC | Vol | U
60 | AvgLengthCommentFb | Average number of characters per comment | UGC | Vol | U
61 | AvgNrOfDaysBetweenCommentsFb | Average number of days between 2 comments | UGC | Vol | T
62 | Nr1WeekBfrRelCommentsFb | Number of comments posted 1 week before the release | UGC | Vol | T
63 | Pct1WeekBfrRelCommentsFb | Percentage of comments posted 1 week before the release | UGC | Vol | T
64 | Nr2WeeksAftrRelCommentsFb | Number of comments posted 2 weeks after the release | UGC | Vol | T
65 | Pct2WeeksAftrRelCommentsFb | Percentage of comments posted 2 weeks after the release | UGC | Vol | T
66 | NrPosCommentsFb | Number of positive comments | UGC | Val | U
67 | NrNegCommentsFb | Number of negative comments | UGC | Val | U
68 | NrNeutralCommentsFb | Number of neutral comments | UGC | Val | U
69 | RatioPosNegCommentsFb | Ratio of positive versus negative comments | UGC | Val | U
70 | PctPosCommentsFb | Percentage of comments being positive | UGC | Val | U
71 | PctNegCommentsFb | Percentage of comments being negative | UGC | Val | U
72 | PctNeutralCommentsFb | Percentage of comments being neutral | UGC | Val | U
73 | NrPosBfrRelCommentsFb | Number of comments before the release being positive | UGC | Val | T
74 | NrNegBfrRelCommentsFb | Number of comments before the release being negative | UGC | Val | T
75 | NrNeutralBfrRelCommentsFb | Number of comments before the release being neutral | UGC | Val | T
76 | PctPosBfrRelCommentsFb | Percentage of comments before the release being positive | UGC | Val | T
77 | PctNegBrfRelCommentsFb | Percentage of comments before the release being negative | UGC | Val | T
78 | PctNeutralBfrRelCommentsFb | Percentage of comments before the release being neutral | UGC | Val | T
79 | NrPosAftrRelCommentsFb | Number of comments after the release being positive | UGC | Val | T
80 | NrNegAftrRelCommentsFb | Number of comments after the release being negative | UGC | Val | T
81 | NrNeutralAftrRelCommentsFb | Number of comments after the release being neutral | UGC | Val | T
82 | RatioPosNegAftrRelCommentsFb | Ratio of positive versus negative comments after the release | UGC | Val | T
83 | PctPosAftrRelCommentsFb | Percentage of comments after the release being positive | UGC | Val | T
84 | PctNegAftrRelCommentsFb | Percentage of comments after the release being negative | UGC | Val | T
85 | PctNeutralAftrRelCommentsFb | Percentage of comments after the release being neutral | UGC | Val | T
86 | NrPos1WeekBfrRelCommentsFb | Number of comments 1 week before the release being positive | UGC | Val | T
87 | NrNeg1WeekBfrRelCommentsFb | Number of comments 1 week before the release being negative | UGC | Val | T
88 | NrNeutral1WeekBfrRelCommentsFb | Number of comments 1 week before the release being neutral | UGC | Val | T
89 | NrPos2WeeksAftrRelCommentsFb | Number of comments 2 weeks after the release being positive | UGC | Val | T
90 | NrNeg2WeeksAftrRelCommentsFb | Number of comments 2 weeks after the release being negative | UGC | Val | T
91 | NrNeutral2WeeksAftrRelCommentsFb | Number of comments 2 weeks after the release being neutral | UGC | Val | T
92 | HypeFactorCommentsFb | Hype factor of the comments | UGC | Vol | U
93 | NrBfrRelCommentsFb | Number of comments posted before the release | UGC | Vol | T
94 | NrAftrRelCommentsFb | Number of comments posted after the release | UGC | Vol | T
95 | PctBfrRelCommentsFb | Percentage of comments posted before the release | UGC | Vol | T
96 | PctAftrRelCommentsFb | Percentage of comments posted after the release | UGC | Vol | T
97 | AvgSentScoreCommentsFb | Average sentiment score of the comments | UGC | Val | U
98 | AvgSentAftrRelCommentsFb | Average sentiment score of the comments after the release | UGC | Val | T
99 | AvgNrOfCommentsPerDayCommentsFb | Average number of comments per day | UGC | Vol | T
100 | AvgNrOfCommentsPerDayBfrRelCommentsFb | Average number of comments per day before the release | UGC | Vol | T
101 | AvgNrOfCommentsPerDayAftrRelCommentsFb | Average number of comments per day after the release | UGC | Vol | T
102 | AvgNrOfCommentsPerDay1WeekBfrRelCommentsFb | Average number of comments per day 1 week before the release | UGC | Vol | T
103 | AvgNrOfCommentsPerDay2WeeksAftrRelCommentsFb | Average number of comments per day 2 weeks after the release | UGC | Vol | T


Table 14: Overview and explanation of the TwTweet variables (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | Variable explanation | PPI/PGC/UGC | Vol/Val | U/T
1 | NrOfFollowersTw | Number of followers on a Twitter page | PPI | - | -
2 | NrOfTweets | Number of tweets on a Twitter page | PGC | Vol | U
3 | NrOfLikesTw | Number of likes on a Twitter page | PPI | - | -
4 | AvgNrOfFavoritedTw | Average number of times a tweet is favorited | PGC | Vol | U
5 | AvgNrOfRetweetedTw | Average number of times a tweet is retweeted | PGC | Vol | U
6 | PctIsRetweetTw | Percentage of tweets being a retweet | PGC | Vol | U
7 | AvgLengthTweet | Average number of characters of a tweet | PGC | Vol | U
8 | AvgNrofDaysBetweenTweets | Average number of days between 2 tweets | PGC | Vol | T
9 | NrBeforeReleaseTw | Number of tweets posted before the release | PGC | Vol | T
10 | NrAfterReleaseTw | Number of tweets posted after the release | PGC | Vol | T
11 | PctBeforeReleaseTw | Percentage of tweets posted before the release | PGC | Vol | T
12 | PctAfterReleaseTw | Percentage of tweets posted after the release | PGC | Vol | T
13 | Nr1weekBfrRelTw | Number of tweets posted 1 week before the release | PGC | Vol | T
14 | Pct1WeekBfrReleaseTw | Percentage of tweets posted 1 week before the release | PGC | Vol | T
15 | Nr2WeekAftrRelTw | Number of tweets posted 2 weeks after the release | PGC | Vol | T
16 | Pct2WeekAftrRelTw | Percentage of tweets posted 2 weeks after the release | PGC | Vol | T
17 | NrPosTw | Number of tweets being positive | PGC | Val | U
18 | NrNegTw | Number of tweets being negative | PGC | Val | U
19 | NrNeutralTw | Number of tweets being neutral | PGC | Val | U
20 | RatioPosNegTw | Ratio of positive versus negative tweets | PGC | Val | U
21 | PctPosTw | Percentage of tweets being positive | PGC | Val | U
22 | PctNegTw | Percentage of tweets being negative | PGC | Val | U
23 | PctNeutralTw | Percentage of tweets being neutral | PGC | Val | U
24 | NrPosBfrRelTw | Number of tweets before the release being positive | PGC | Val | T
25 | NrNegBfrRelTw | Number of tweets before the release being negative | PGC | Val | T
26 | NrNeutralBfrRelTw | Number of tweets before the release being neutral | PGC | Val | T
27 | RatioPosNegBfrRelTw | Ratio of positive versus negative tweets before the release | PGC | Val | T
28 | PctPosBfrRelTw | Percentage of tweets before the release being positive | PGC | Val | T
29 | PctNegBfrRelTw | Percentage of tweets before the release being negative | PGC | Val | T
30 | PctNeutralBfrRelTw | Percentage of tweets before the release being neutral | PGC | Val | T
31 | NrPosAftrRelTw | Number of tweets after the release being positive | PGC | Val | T
32 | NrNegAftrRelTw | Number of tweets after the release being negative | PGC | Val | T
33 | NrNeutralAftrRelTw | Number of tweets after the release being neutral | PGC | Val | T
34 | RatioPosNegAftrRelTw | Ratio of positive versus negative tweets after the release | PGC | Val | T
35 | PctPosAftrRelTw | Percentage of tweets after the release being positive | PGC | Val | T
36 | PctNegAftrRelTw | Percentage of tweets after the release being negative | PGC | Val | T
37 | PctNeutralAftrRelTw | Percentage of tweets after the release being neutral | PGC | Val | T
38 | NrPos1WeekBfrRelTw | Number of tweets 1 week before the release being positive | PGC | Val | T
39 | NrNeg1WeekBfrRelTw | Number of tweets 1 week before the release being negative | PGC | Val | T
40 | NrNeutral1WeekBfrRelTw | Number of tweets 1 week before the release being neutral | PGC | Val | T
41 | NrPos2WeeksAftrRelTw | Number of tweets 2 weeks after the release being positive | PGC | Val | T
42 | NrNeg2WeeksAftrRelTw | Number of tweets 2 weeks after the release being negative | PGC | Val | T
43 | RatioPosNeg2WeeksAftrRelTw | Ratio of positive versus negative tweets 2 weeks after the release | PGC | Val | T
44 | PctPos2WeeksAftrRelTw | Percentage of tweets 2 weeks after the release being positive | PGC | Val | T
45 | PctNeg2WeeksAftrRelTw | Percentage of tweets 2 weeks after the release being negative | PGC | Val | T
46 | AvgSentScoreTw | Average sentiment score of the tweets | PGC | Val | U
47 | AvgSentBfrRelTw | Average sentiment score of tweets before the release | PGC | Val | T
48 | AvgSentAftrRelTw | Average sentiment score of tweets after the release | PGC | Val | T
49 | AvgSent2WeeksAftrRelTw | Average sentiment score of tweets 2 weeks after the release | PGC | Val | T
50 | ChangeSentPrePostRelTw | Change in sentiment score of tweets before and after the release | PGC | Val | T
51 | PctChangeSentPrePostRelTw | Percentage change in sentiment score of tweets before and after the release | PGC | Val | T
52 | AvgNrOfTweetsPerDayTweetsTw | Average number of tweets per day | PGC | Vol | T
53 | AvgNrOfTweetsPerDayBfrRelTweetsTw | Average number of tweets per day before the release | PGC | Vol | T
54 | AvgNrOfTweetsPerDayAftrRelTweetsTw | Average number of tweets per day after the release | PGC | Vol | T
55 | AvgNrOfTweetsPerDay1WeekBfrRelTweetsTw | Average number of tweets per day 1 week before the release | PGC | Vol | T
56 | AvgNrOfTweetsPerDay2WeeksAftrRelTweetsTw | Average number of tweets per day 2 weeks after the release | PGC | Vol | T
57 | NrNeutral2WeeksAftrRelTw | Number of neutral tweets 2 weeks after the release | PGC | Val | T
58 | PctNeutral2WeeksAftrRelTw | Percentage of tweets 2 weeks after the release being neutral | PGC | Val | T


Table 15: Overview and explanation of the additional TwAll variables (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | Variable explanation | PPI/PGC/UGC | Vol/Val | U/T
59 | NrofRepliesTw | Number of replies | UGC | Vol | U
60 | AvgNrOfRepliesTw | Average number of replies per tweet | UGC | Vol | U
61 | AvgNrOfRetweetedRepliesTw | Average number of times a reply is retweeted | UGC | Vol | U
62 | AvgNrOfFavoritedRepliesTw | Average number of times a reply is favorited | UGC | Vol | U
63 | AvgLengthReplyTw | Average number of characters per reply | UGC | Vol | U
64 | AvgNrOfDaysBetweenRepliesTw | Average number of days between 2 replies | UGC | Vol | T
65 | NrBfrRelRepliesTw | Number of replies before the release | UGC | Vol | T
66 | NrAftrRelRepliesTw | Number of replies after the release | UGC | Vol | T
67 | PctBfrRelRepliesTw | Percentage of replies posted before the release | UGC | Vol | T
68 | PctAftrRelRepliesTw | Percentage of replies posted after the release | UGC | Vol | T
69 | Nr1WeekBfrRelRepliesTw | Number of replies 1 week before the release | UGC | Vol | T
70 | Pct1WeekBfrRelRepliesTw | Percentage of replies 1 week before the release | UGC | Vol | T
71 | Nr2WeeksAftrRelRepliesTw | Number of replies 2 weeks after the release | UGC | Vol | T
72 | Pct2WeeksAftrRelRepliesTw | Percentage of replies 2 weeks after the release | UGC | Vol | T
73 | NrPosRepliesTw | Number of positive replies | UGC | Val | U
74 | NrNegRepliesTw | Number of negative replies | UGC | Val | U
75 | NrNeutralRepliesTw | Number of neutral replies | UGC | Val | U
76 | RatioPosNegRepliesTw | Ratio of positive versus negative replies | UGC | Val | U
77 | PctPosRepliesTw | Percentage of replies being positive | UGC | Val | U
78 | PctNegRepliesTw | Percentage of replies being negative | UGC | Val | U
79 | PctNeutralRepliesTw | Percentage of replies being neutral | UGC | Val | U
80 | NrPosBfrRelRepliesTw | Number of replies before the release being positive | UGC | Val | T
81 | NrNegBfrRelRepliesTw | Number of replies before the release being negative | UGC | Val | T
82 | NrNeutralBfrRelRepliesTw | Number of replies before the release being neutral | UGC | Val | T
83 | PctPosBfrRelRepliesTw | Percentage of replies before the release being positive | UGC | Val | T
84 | PctNegBrfRelRepliesTw | Percentage of replies before the release being negative | UGC | Val | T
85 | PctNeutralBfrRelRepliesTw | Percentage of replies before the release being neutral | UGC | Val | T
86 | NrPosAftrRelRepliesTw | Number of replies after the release being positive | UGC | Val | T
87 | NrNegAftrRelRepliesTw | Number of replies after the release being negative | UGC | Val | T
88 | RatioPosNegAftrRelRepliesTw | Ratio of positive versus negative replies after the release | UGC | Val | T
89 | PctPosAftrRelRepliesTw | Percentage of replies after the release being positive | UGC | Val | T
90 | PctNegAftrRelRepliesTw | Percentage of replies after the release being negative | UGC | Val | T
91 | NrPos1WeekBfrRelRepliesTw | Number of replies 1 week before the release being positive | UGC | Val | T
92 | NrNeg1WeekBfrRelRepliesTw | Number of replies 1 week before the release being negative | UGC | Val | T
93 | NrNeutral1WeekBfrRelRepliesTw | Number of replies 1 week before the release being neutral | UGC | Val | T
94 | NrPos2WeeksAftrRelRepliesTw | Number of replies 2 weeks after the release being positive | UGC | Val | T
95 | NrNeg2WeeksAftrRelRepliesTw | Number of replies 2 weeks after the release being negative | UGC | Val | T
96 | NrNeutral2WeeksAftrRelRepliesTw | Number of replies 2 weeks after the release being neutral | UGC | Val | T
97 | AvgSentScoreRepliesTw | Average sentiment score of the replies | UGC | Val | U
98 | AvgSentAftrRelRepliesTw | Average sentiment score of replies after the release | UGC | Val | T
99 | AvgNrOfRepliesPerDayRepliesTw | Average number of replies per day | UGC | Vol | T
100 | AvgNrOfRepliesPerDayBfrRelRepliesTw | Average number of replies per day before the release | UGC | Vol | T
101 | AvgNrOfRepliesPerDayAftrRelRepliesTw | Average number of replies per day after the release | UGC | Vol | T
102 | AvgNrOfRepliesPerDay1WeekBfrRelRepliesTw | Average number of replies per day 1 week before the release | UGC | Vol | T
103 | AvgNrOfRepliesPerDay2WeeksAftrRelRepliesTw | Average number of replies per day 2 weeks after the release | UGC | Vol | T
104 | HypeFactorRepliesTw | Hype factor of the replies | UGC | Vol | U
105 | NrNeutralAftrRelRepliesTw | Number of replies after the release being neutral | UGC | Val | T
106 | PctNeutralAftrRelRepliesTw | Percentage of replies after the release being neutral | UGC | Val | T


Attachment 3: Variable importances

Table 16: Variable importances, FbAll (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | MeanIncreaseRMSE | PPI/PGC/UGC | Vol/Val | U/T
1 | NrOfLikesFb | 0.059233572 | PPI | - | -
2 | HypeFactorCommentsFb | 0.011336754 | UGC | Vol | U
3 | NrTalkingAboutFb | 0.003150166 | PGC | Vol | U
4 | Nr2WeeksAftrRelCommentsFb | 0.00310047 | UGC | Vol | T
5 | AvgLengthPostFb | 0.002899706 | PGC | Vol | U
6 | PctAftrRelCommentsFb | 0.002748436 | UGC | Vol | T
7 | AvgSentScoreCommentsFb | 0.002536616 | UGC | Val | U
8 | AvgSentAftrRelCommentsFb | 0.002302529 | UGC | Val | T
9 | AvgNrOfLikesFb | 0.002288758 | PGC | Vol | U
10 | RatioPosNegCommentsFb | 0.002210272 | UGC | Val | U
11 | NrNeutral1WeekBfrRelCommentsFb | 0.002084099 | UGC | Val | T
12 | AvgNrOfDaysBetweenCommentsFb | 0.002012703 | UGC | Vol | U
13 | AvgNrOfCommentsFb | 0.001891327 | UGC | Vol | U
14 | NrNegBfrRelCommentsFb | 0.001683776 | UGC | Val | T
15 | NrPos1WeekBfrRelFb | 0.001633256 | PGC | Val | T
16 | NrNeg2WeeksAftrRelCommentsFb | 0.001294273 | UGC | Val | T
17 | AvgNrOfCommentsPerDay2WeeksAftrRelCommentsFb | 0.001132755 | UGC | Vol | T
18 | NrPos1WeekBfrRelCommentsFb | 0.001053214 | UGC | Val | T
19 | Pct1WeekBfrRelCommentsFb | 0.001050065 | UGC | Vol | T
20 | PctPosAftrRelFb | 0.001043968 | PGC | Val | T
21 | NrPosCommentsFb | 0.001005735 | UGC | Val | U
22 | NrNeutral2WeeksAftrRelCommentsFb | 0.00097894 | UGC | Val | T
23 | PctPosCommentsFb | 0.000965039 | UGC | Val | U
24 | Pct1WeekBfrReleaseFb | 0.000938676 | PGC | Vol | T
25 | RatioPosNegAftrRelCommentsFb | 0.000917156 | UGC | Val | T
26 | AvgSentAftrRelFb | 0.000916474 | PGC | Val | T
27 | RatioPosNegBfrRelFb | 0.0008872 | PGC | Val | T
28 | NrNeutralBfrRelCommentsFb | 0.000886923 | UGC | Val | T
29 | Pct2WeeksAftrReleaseFb | 0.000885777 | PGC | Vol | T
30 | NrPosAftrRelCommentsFb | 0.000802458 | UGC | Val | T
31 | PctBeforeReleaseFb | 0.000799479 | PGC | Vol | T
32 | AvgNrOfDaysBetweenPostsFb | 0.000750939 | PGC | Vol | U
33 | PctNeutralCommentsFb | 0.000746921 | UGC | Val | U
34 | AvgNrOfCommentsPerDayBfrRelCommentsFb | 0.00073755 | UGC | Vol | T
35 | AvgSentBfrRelFb | 0.000736951 | PGC | Val | T
36 | Nr1WeekBfrRelFb | 0.000733388 | PGC | Vol | T
37 | AvgSentScoreFb | 0.000648516 | PGC | Val | U
38 | Nr1WeekBfrRelCommentsFb | 0.000638725 | UGC | Val | T
39 | NrNeutralBfrRelFb | 0.000615251 | PGC | Val | T
40 | AvgNrOfCommentsPerDayAftrRelCommentsFb | 0.000611939 | UGC | Vol | T
41 | NrNegCommentsFb | 0.000575964 | UGC | Val | U
42 | AvgNrLikesCommentFb | 0.000573602 | UGC | Vol | U
43 | AvgNrOfPostsPerDayAftrRelPostsFb | 0.000558531 | PGC | Vol | T
44 | PctBfrRelCommentsFb | 0.000537102 | UGC | Vol | T
45 | PctAfterReleaseFb | 0.000536526 | PGC | Vol | T
46 | NrNeutralFb | 0.000535423 | PGC | Val | U
47 | AvgSent2WeeksAftrRelFb | 0.000527532 | PGC | Val | T
48 | NrNeutral1WeekBfrRelFb | 0.000509987 | PGC | Val | T
49 | NrPos2WeeksAftrRelFb | 0.00049831 | PGC | Val | T
50 | PctNegCommentsFb | 0.000496625 | PGC | Val | T
51 | NrNegBfrRelFb | 0.00048592 | PGC | Val | T


(Table 16 – Part 2)

Nr. | Variable name | MeanIncreaseRMSE | PPI/PGC/UGC | Vol/Val | U/T
52 | AvgNrOfCommentsPerDay1WeekBfrRelCommentsFb | 0.000485822 | UGC | Vol | T
53 | PctNeutralBfrRelFb | 0.00047041 | PGC | Val | T
54 | RatioPosNegAftrRelFb | 0.000456489 | PGC | Val | T
55 | NrNegFb | 0.000443205 | PGC | Val | U
56 | NrNegAftrRelFb | 0.000428957 | PGC | Val | T
57 | RatioPosNegFb | 0.000416811 | PGC | Val | U
58 | PctNeutralFb | 0.000399708 | PGC | Val | U
59 | NrNeg1WeekBfrRelCommentsFb | 0.000399394 | UGC | Val | T
60 | NrAFterReleaseFb | 0.000390431 | PGC | Vol | T
61 | AvgLengthCommentFb | 0.000368679 | UGC | Vol | U
62 | NrOfPostsFb | 0.000346173 | PGC | Vol | U
63 | PctNegAftrRelFb | 0.000342155 | PGC | Val | T
64 | PctNeutralAftrRelFb | 0.000341166 | PGC | Val | T
65 | Pct2WeeksAftrRelCommentsFb | 0.000336866 | UGC | Vol | T
66 | ChangeSentPrePostRelFb | 0.000307807 | PGC | Val | T
67 | PctNegBrfRelCommentsFb | 0.000305464 | UGC | Val | T
68 | NrBfrRelCommentsFb | 0.000302337 | UGC | Vol | T
69 | PctPosBfrRelFb | 0.000282598 | PGC | Val | T
70 | PctPos2WeeksAftrRelFb | 0.000269666 | PGC | Val | T
71 | NrNeg2WeeksAftrRelFb | 0.000260848 | PGC | Val | T
72 | PctPosFb | 0.000260343 | PGC | Val | U
73 | PctNegBfrRelFb | 0.000259127 | PGC | Val | T
74 | NrNeg1WeekBfrRelFb | 0.000250267 | PGC | Val | T
75 | PctNeg2WeeksAftrRelFb | 0.000243546 | PGC | Val | T
76 | NrBeforeReleaseFb | 0.000238737 | PGC | Vol | T
77 | AvgNrOfPostsPerDayPostsFb | 0.000229793 | PGC | Vol | U
78 | NrPosAftrRelFb | 0.000217603 | PGC | Val | T
79 | NrofCommentsFb | 0.000215346 | PGC | Vol | U
80 | AvgNrOfPostsPerDay1WeekBfrRelPostsFb | 0.000199124 | PGC | Vol | T
81 | PctPosBfrRelCommentsFb | 0.000198367 | UGC | Val | T
82 | PctNegFb | 0.000190597 | PGC | Val | U
83 | NrPosFb | 0.000185369 | PGC | Val | U
84 | PctPosAftrRelCommentsFb | 0.000176242 | UGC | Val | T
85 | NrPosBfrRelFb | 0.000173117 | PGC | Val | T
86 | NrNeutralAftrRelFb | 0.000170896 | PGC | Val | T
87 | AvgNrOfPostsPerDay2WeeksAftrRelPostsFb | 0.00016473 | PGC | Vol | T
88 | PctNeutralBfrRelCommentsFb | 0.00016344 | UGC | Val | T
89 | NrNeutralCommentsFb | 0.000152312 | UGC | Val | U
90 | NrPosBfrRelCommentsFb | 0.000148385 | UGC | Val | T
91 | Nr2WeekAftrRelFb | 0.000146559 | PGC | Vol | T
92 | RatioPosNeg2WeeksAftrRelFb | 0.000144365 | PGC | Val | T
93 | NrPos2WeeksAftrRelCommentsFb | 0.000142077 | UGC | Val | T
94 | AvgNrOfCommentsPerDayCommentsFb | 0.000139521 | UGC | Vol | U
95 | NrNeutral2WeeksAftrRelFb | 0.000124336 | PGC | Val | T
96 | PctNegAftrRelCommentsFb | 0.000122126 | UGC | Val | T
97 | NrNegAftrRelCommentsFb | 0.000122015 | UGC | Val | T
98 | NrNeutralAftrRelCommentsFb | 0.000086709 | UGC | Val | T
99 | AvgNrOfPostsPerDayBfrRelPostsFb | 0.000069906 | PGC | Vol | T
100 | PctNeutralAftrRelCommentsFb | 0.000063441 | UGC | Val | T
101 | PctChangeSentPrePostRelFb | 0.000054081 | PGC | Val | T
102 | PctNeutral2WeeksAftrRelFb | 0.000039436 | PGC | Val | T
103 | NrAftrRelCommentsFb | 0.000000000 | UGC | Vol | T


Table 17: Variable importances, TwAll (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | MeanIncreaseRMSE | PPI/PGC/UGC | Vol/Val | U/T
1 | NrOfFollowersTw | 0.038163855 | PPI | - | -
2 | AvgNrOfRetweetedRepliesTw | 0.005223189 | UGC | Vol | U
3 | AvgLengthReplyTw | 0.004376805 | UGC | Vol | U
4 | AvgNrOfFavoritedTw | 0.004024057 | PGC | Vol | U
5 | AvgNrOfRepliesTw | 0.003775183 | UGC | Vol | U
6 | Pct1WeekBfrReleaseTw | 0.003005681 | PGC | Vol | T
7 | PctNeutral2WeeksAftrRelTw | 0.002580824 | PGC | Val | T
8 | PctBfrRelRepliesTw | 0.00229981 | UGC | Vol | T
9 | PctPos2WeeksAftrRelTw | 0.001906647 | PGC | Val | T
10 | Pct2WeekAftrRelTw | 0.001895665 | PGC | Vol | T
11 | RatioPosNegTw | 0.001795578 | PGC | Val | U
12 | PctAftrRelRepliesTw | 0.001787359 | UGC | Vol | T
13 | Pct2WeeksAftrRelRepliesTw | 0.001715623 | UGC | Vol | T
14 | AvgLengthTweet | 0.001698326 | PGC | Vol | U
15 | PctPosBfrRelRepliesTw | 0.001499594 | UGC | Val | T
16 | NrofRepliesTw | 0.001374348 | UGC | Vol | U
17 | AvgNrOfRetweetedTw | 0.001300386 | PGC | Vol | U
18 | PctPosAftrRelTw | 0.00124755 | PGC | Val | T
19 | NrPosRepliesTw | 0.001194018 | UGC | Val | U
20 | AvgNrOfTweetsPerDayBfrRelTweetsTw | 0.001193191 | PGC | Vol | T
21 | Pct1WeekBfrRelRepliesTw | 0.001192362 | UGC | Vol | T
22 | RatioPosNeg2WeeksAftrRelTw | 0.001175513 | PGC | Val | T
23 | AvgSent2WeeksAftrRelTw | 0.001135873 | PGC | Vol | T
24 | NrOfLikesTw | 0.001135029 | PPI | - | -
25 | Nr2WeeksAftrRelRepliesTw | 0.001100214 | UGC | Vol | T
26 | AvgSentBfrRelTw | 0.001017843 | PGC | Val | T
27 | RatioPosNegAftrRelTw | 0.00097305 | PGC | Val | T
28 | PctPosTw | 0.000963788 | PGC | Val | U
29 | PctNeutralAftrRelTw | 0.000949205 | PGC | Val | T
30 | AvgNrOfTweetsPerDay2WeeksAftrRelTweetsTw | 0.000937676 | UGC | Vol | T
31 | NrPosBfrRelRepliesTw | 0.000923584 | PGC | Val | T
32 | PctAfterReleaseTw | 0.000842966 | UGC | Vol | T
33 | AvgNrOfFavoritedRepliesTw | 0.000838125 | PGC | Vol | U
34 | PctIsRetweetTw | 0.000823227 | UGC | Vol | U
35 | NrAftrRelRepliesTw | 0.000811838 | UGC | Vol | T
36 | PctPosRepliesTw | 0.000776408 | UGC | Val | U
37 | PctNeutralTw | 0.000738946 | PGC | Val | U
38 | NrPos2WeeksAftrRelTw | 0.000733822 | PGC | Val | T
39 | NrPos1WeekBfrRelTw | 0.000708351 | PGC | Val | T
40 | AvgSentAftrRelTw | 0.000636766 | PGC | Val | T
41 | PctPosAftrRelRepliesTw | 0.000625271 | UGC | Val | T
42 | NrBfrRelRepliesTw | 0.000615505 | UGC | Vol | T
43 | NrNeg2WeeksAftrRelTw | 0.000612833 | PGC | Val | T
44 | AvgSentAftrRelRepliesTw | 0.000596564 | UGC | Val | T
45 | RatioPosNegRepliesTw | 0.000577439 | UGC | Val | U
46 | NrNeutralRepliesTw | 0.000565386 | UGC | Val | U
47 | AvgSentScoreRepliesTw | 0.00055603 | UGC | Val | U
48 | PctNeg2WeeksAftrRelTw | 0.000550939 | PGC | Val | T
49 | PctNegAftrRelTw | 0.000494987 | PGC | Val | T
50 | PctNeutralBfrRelRepliesTw | 0.000483935 | UGC | Val | T
51 | PctNegTw | 0.000483588 | PGC | Val | U


(Table 17 - Part 2)

Nr. | Variable name | MeanIncreaseRMSE | PPI/PGC/UGC | Vol/Val | U/T
52 | NrAfterReleaseTw | 0.000481402 | PGC | Vol | T
53 | PctNegRepliesTw | 0.000477777 | UGC | Val | U
54 | NrOfTweets | 0.000471264 | PGC | Vol | U
55 | NrNegAftrRelTw | 0.000470625 | PGC | Val | T
56 | PctNegBfrRelTw | 0.000453752 | PGC | Val | T
57 | AvgNrOfRepliesPerDayRepliesTw | 0.00045058 | UGC | Vol | U
58 | NrNeutralBfrRelTw | 0.000438375 | PGC | Val | T
59 | PctBeforeReleaseTw | 0.000433771 | PGC | Vol | T
60 | PctNeutralBfrRelTw | 0.000422546 | PGC | Val | T
61 | NrNeutralAftrRelRepliesTw | 0.000406074 | UGC | Val | T
62 | RatioPosNegBfrRelTw | 0.000398499 | PGC | Val | T
63 | NrNeg1WeekBfrRelTw | 0.000391266 | PGC | Val | T
64 | NrPos2WeeksAftrRelRepliesTw | 0.000385844 | UGC | Val | T
65 | AvgNrofDaysBetweenTweets | 0.000373921 | PGC | Vol | U
66 | NrPosTw | 0.000372165 | PGC | Val | U
67 | NrNeg1WeekBfrRelRepliesTw | 0.000368904 | UGC | Val | T
68 | NrPosAftrRelRepliesTw | 0.000367011 | UGC | Val | T
69 | AvgNrOfRepliesPerDay1WeekBfrRelRepliesTw | 0.000362129 | UGC | Vol | T
70 | HypeFactorRepliesTw | 0.00035984 | UGC | Vol | U
71 | NrNeutral1WeekBfrRelTw | 0.000356663 | PGC | Val | T
72 | AvgSentScoreTw | 0.000354648 | PGC | Val | U
73 | AvgNrOfRepliesPerDay2WeeksAftrRelRepliesTw | 0.000351185 | UGC | Vol | T
74 | NrNegRepliesTw | 0.000350469 | UGC | Val | U
75 | NrNegBfrRelRepliesTw | 0.000343612 | UGC | Val | T
76 | NrPosBfrRelTw | 0.000318988 | PGC | Val | T
77 | AvgNrOfRepliesPerDayAftrRelRepliesTw | 0.000310066 | UGC | Vol | T
78 | PctNegAftrRelRepliesTw | 0.000276849 | UGC | Val | T
79 | NrNeutralTw | 0.000250738 | PGC | Val | U
80 | Nr2WeekAftrRelTw | 0.000249137 | PGC | Vol | T
81 | ChangeSentPrePostRelTw | 0.000210995 | PGC | Val | T
82 | NrNeg2WeeksAftrRelRepliesTw | 0.000209306 | UGC | Val | T
83 | AvgNrOfTweetsPerDay1WeekBfrRelTweetsTw | 0.000204164 | PGC | Vol | T
84 | AvgNrOfTweetsPerDayAftrRelTweetsTw | 0.000196563 | PGC | Vol | T
85 | NrBeforeReleaseTw | 0.000195268 | PGC | Vol | T
86 | RatioPosNegAftrRelRepliesTw | 0.000189396 | UGC | Val | T
87 | AvgNrOfTweetsPerDayTweetsTw | 0.00018015 | PGC | Vol | U
88 | NrPosAftrRelTw | 0.000174172 | PGC | Val | T
89 | AvgNrOfRepliesPerDayBfrRelRepliesTw | 0.000165877 | UGC | Vol | T
90 | NrNegBfrRelTw | 0.000161832 | PGC | Val | T
91 | Nr1WeekBfrRelRepliesTw | 0.000146794 | UGC | Vol | T
92 | PctNeutralRepliesTw | 0.000144688 | UGC | Val | U
93 | Nr1weekBfrRelTw | 0.000142921 | PGC | Vol | T
94 | PctPosBfrRelTw | 0.000142254 | PGC | Val | T
95 | AvgNrOfDaysBetweenRepliesTw | 0.000140750 | UGC | Vol | U
96 | NrNeutral2WeeksAftrRelRepliesTw | 0.000130675 | UGC | Val | T
97 | NrPos1WeekBfrRelRepliesTw | 0.000122409 | UGC | Val | T
98 | NrNeutralBfrRelRepliesTw | 0.000111467 | UGC | Val | T
99 | NrNeutral1WeekBfrRelRepliesTw | 0.000088892 | UGC | Val | T
100 | NrNeutralAftrRelTw | 0.000072397 | PGC | Val | T
101 | NrNegAftrRelRepliesTw | 0.000060489 | UGC | Val | T
102 | NrNeutral2WeeksAftrRelTw | 0.000049193 | PGC | Val | T
103 | NrNegTw | 0.000039196 | PGC | Val | U
104 | PctNeutralAftrRelRepliesTw | 0.000027776 | UGC | Val | T
105 | PctNegBrfRelRepliesTw | 0.000023856 | UGC | Val | T
106 | PctChangeSentPrePostRelTw | 0.000023439 | PGC | Val | T
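The MeanIncreaseRMSE values in Tables 16 through 18 are permutation-based importances: shuffle one predictor at a time and record how much the model's RMSE rises. A minimal sketch of that idea, assuming the fitted regression model is available as a callable; the toy model and data below are illustrative only, and the thesis procedure may additionally average over repetitions or over the trees of a random forest:

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def permutation_importance(model, X, y, rng=None):
    """Increase in RMSE when each column of X is shuffled in turn."""
    rng = rng or np.random.default_rng(0)
    base = rmse(y, model(X))
    increases = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        rng.shuffle(Xp[:, j])  # break the link between column j and y
        increases.append(rmse(y, model(Xp)) - base)
    return np.array(increases)

# Toy check: y depends on column 0 only, so column 0 should dominate.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 0]
model = lambda X: 2 * X[:, 0]  # stand-in for a fitted regression model
imp = permutation_importance(model, X, y)
print(imp)
```

Shuffling an informative column degrades the predictions (a large positive increase), while shuffling a column the model ignores leaves the RMSE unchanged.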


Table 18: Variable importances, FbTwAll (PPI = page popularity indicator, PGC = page-generated content, UGC = user-generated content, Vol = volume, Val = valence, U = unbounded, T = time-based)

Nr. | Variable name | MeanIncreaseRMSE | PPI/PGC/UGC | Vol/Val | U/T
1 | NrOfLikesFb | 0.0627388 | PPI | - | -
2 | NrNegBfrRelCommentsFb | 0.0034630 | UGC | Val | T
3 | Nr2WeeksAftrRelCommentsFb | 0.0033991 | UGC | Vol | T
4 | HypeFactorCommentsFb | 0.0033844 | UGC | Vol | U
5 | NrNeutral2WeeksAftrRelCommentsFb | 0.0029163 | UGC | Val | T
6 | AvgSentScoreCommentsFb | 0.0027985 | UGC | Val | U
7 | AvgNrOfFavoritedTw | 0.0025648 | PGC | Vol | U
8 | AvgNrOfRetweetedRepliesTw | 0.0019395 | UGC | Vol | U
9 | AvgLengthPostFb | 0.0015197 | PGC | Vol | U
10 | NrPosAftrRelCommentsFb | 0.0014938 | UGC | Val | T
11 | AvgNrOfRepliesTw | 0.0014061 | UGC | Vol | U
12 | Pct2WeeksAftrReleaseFb | 0.0013976 | PGC | Vol | T
13 | NrTalkingAboutFb | 0.0013387 | PGC | Vol | U
14 | AvgNrOfCommentsFb | 0.0013277 | UGC | Vol | U
15 | RatioPosNegCommentsFb | 0.0012836 | UGC | Val | U
16 | NrPosCommentsFb | 0.0012379 | UGC | Val | U
17 | AvgNrOfDaysBetweenPostsFb | 0.0012313 | PGC | Vol | U
18 | PctNeutralCommentsFb | 0.0011401 | UGC | Val | U
19 | AvgSentAftrRelCommentsFb | 0.0011359 | UGC | Val | T
20 | NrNeutralBfrRelCommentsFb | 0.0011331 | UGC | Val | T
21 | AvgLengthReplyTw | 0.0011142 | UGC | Val | U
22 | PctPosBfrRelRepliesTw | 0.0010524 | UGC | Val | T
23 | NrBfrRelCommentsFb | 0.0010205 | UGC | Vol | T
24 | NrPos1WeekBfrRelFb | 0.0010111 | PGC | Val | T
25 | PctPos2WeeksAftrRelTw | 0.0009601 | PGC | Val | T
26 | PctNeutral2WeeksAftrRelTw | 0.0009389 | PGC | Val | T
27 | NrNeutral1WeekBfrRelCommentsFb | 0.0009328 | UGC | Val | T
28 | AvgNrOfTweetsPerDayBfrRelTweetsTw | 0.0009042 | PGC | Vol | T
29 | PctPosFb | 0.0008720 | PGC | Val | U
30 | Pct1WeekBfrReleaseTw | 0.0008645 | PGC | Vol | T
31 | AvgNrOfRetweetedTw | 0.0008450 | PGC | Vol | U
32 | PctPosBfrRelFb | 0.0008428 | PGC | Val | T
33 | NrofCommentsFb | 0.0008336 | UGC | Vol | U
34 | NrOfFollowersTw | 0.0008256 | PPI | - | -
35 | PctBeforeReleaseFb | 0.0008168 | PGC | Vol | T
36 | Pct2WeeksAftrRelRepliesTw | 0.0007794 | UGC | Vol | T
37 | PctNeutralAftrRelRepliesTw | 0.0007469 | UGC | Val | T
38 | PctPosCommentsFb | 0.0007453 | UGC | Val | U
39 | AvgNrOfCommentsPerDay2WeeksAftrRelCommentsFb | 0.0007219 | UGC | Vol | T
40 | AvgSent2WeeksAftrRelTw | 0.0007098 | PGC | Val | T
41 | AvgNrOfCommentsPerDayBfrRelCommentsFb | 0.0007057 | UGC | Vol | T
42 | NrNeutralBfrRelFb | 0.0006787 | PGC | Val | T
43 | NrNeg2WeeksAftrRelCommentsFb | 0.0006694 | UGC | Val | T
44 | PctBfrRelCommentsFb | 0.0006557 | UGC | Vol | T
45 | NrPos1WeekBfrRelCommentsFb | 0.0006334 | UGC | Val | T
46 | PctBfrRelRepliesTw | 0.0006123 | UGC | Vol | T
47 | AvgNrOfCommentsPerDayAftrRelCommentsFb | 0.0006097 | UGC | Vol | T
48 | PctIsRetweetTw | 0.0006096 | PGC | Vol | U
49 | PctPosAftrRelFb | 0.0006081 | PGC | Val | T
50 | AvgNrOfRepliesPerDayRepliesTw | 0.0005865 | UGC | Vol | U


(Table 18 – Part 2)

Nr. Variable name MeanIncreaseRMSE PPI/PGC/UGC Vol/Val U/T
51 HypeFactorRepliesTw 0.0005754 UGC Vol U
52 Nr1WeekBfrRelCommentsFb 0.0005691 UGC Vol T
53 NrNegCommentsFb 0.0005642 UGC Val U
54 PctPosAftrRelTw 0.0005593 PGC Val T
55 Pct1WeekBfrRelRepliesTw 0.0005317 UGC Vol T
56 PctNeg2WeeksAftrRelTw 0.0005134 PGC Val T
57 AvgNrOfCommentsPerDayCommentsFb 0.0005118 UGC Vol U
58 RatioPosNegRepliesTw 0.0005084 UGC Val U
59 Nr1WeekBfrRelFb 0.0004886 PGC Vol T
60 NrNeutralAftrRelCommentsFb 0.0004794 UGC Val T
61 NrNegBfrRelFb 0.0004740 PGC Val T
62 NrNeutralBfrRelTw 0.0004734 PGC Val T
63 NrBeforeReleaseFb 0.0004689 PGC Vol T
64 RatioPosNeg2WeeksAftrRelTw 0.0004585 PGC Val T
65 PctNegBrfRelCommentsFb 0.0004478 UGC Val T
66 AvgNrofDaysBetweenTweets 0.0004410 PGC Vol U
67 RatioPosNeg2WeeksAftrRelFb 0.0004400 PGC Val T
68 AvgNrLikesCommentFb 0.0004355 UGC Vol U
69 AvgSentAftrRelTw 0.0004195 PGC Val U
70 AvgNrOfCommentsPerDay1WeekBfrRelCommentsFb 0.0004070 UGC Vol T
71 AvgLengthTweet 0.0004053 PGC Vol U
72 RatioPosNegTw 0.0004033 PGC Val U
73 PctBeforeReleaseTw 0.0003973 PGC Vol T
74 AvgNrOfFavoritedRepliesTw 0.0003909 UGC Vol U
75 NrNeutralFb 0.0003899 PGC Val U
76 PctNegAftrRelFb 0.0003891 PGC Val T
77 PctNeutralBfrRelFb 0.0003866 PGC Val T
78 ChangeSentPrePostRelFb 0.0003757 PGC Val T
79 NrPosRepliesTw 0.0003757 UGC Val U
80 PctNeutralAftrRelTw 0.0003707 PGC Val T
81 PctNeutral2WeeksAftrRelFb 0.0003694 PGC Val T
82 AvgSentScoreFb 0.0003668 PGC Val U
83 AvgSentScoreTw 0.0003644 PGC Val U
84 NrNeutralCommentsFb 0.0003564 UGC Val U
85 NrNeutral2WeeksAftrRelFb 0.0003498 PGC Val T
86 Pct2WeekAftrRelTw 0.0003455 PGC Vol T
87 PctNeutralTw 0.0003418 PGC Val U
88 NrNeutral1WeekBfrRelTw 0.0003382 PGC Val T
89 PctNegBfrRelTw 0.0003351 PGC Val T
90 Nr2WeekAftrRelFb 0.0003328 PGC Vol T
91 AvgSentBfrRelTw 0.0003306 PGC Val T
92 AvgNrOfPostsPerDayPostsFb 0.0003292 PGC Vol U
93 NrNegBfrRelTw 0.0003283 PGC Val T
94 NrNegAftrRelCommentsFb 0.0003230 UGC Val T
95 NrPos2WeeksAftrRelCommentsFb 0.0003207 UGC Val T
96 AvgSentAftrRelFb 0.0003203 PGC Val T
97 AvgNrOfPostsPerDayAftrRelPostsFb 0.0003203 PGC Vol U
98 PctNegCommentsFb 0.0003172 UGC Val U
99 AvgNrOfDaysBetweenRepliesTw 0.0003141 UGC Vol U
100 NrNeg2WeeksAftrRelFb 0.0002964 PGC Val T


(Table 18 – Part 3)

Nr. Variable name MeanIncreaseRMSE PPI/PGC/UGC Vol/Val U/T
101 RatioPosNegBfrRelTw 0.0002912 PGC Val U
102 PctPosTw 0.0002895 PGC Val U
103 NrNeg1WeekBfrRelFb 0.0002892 PGC Val T
104 NrPosBfrRelFb 0.0002879 PGC Val T
105 NrBeforeReleaseTw 0.0002873 PGC Vol T
106 PctNeutralBfrRelRepliesTw 0.0002853 UGC Val T
107 PctPosBfrRelTw 0.0002853 PGC Val T
108 PctPosAftrRelRepliesTw 0.0002848 UGC Val T
109 NrNeg1WeekBfrRelTw 0.0002828 PGC Val T
110 PctNeutralBfrRelTw 0.0002809 PGC Val T
111 AvgNrOfPostsPerDayBfrRelPostsFb 0.0002763 PGC Vol T
112 PctPosRepliesTw 0.0002721 UGC Val U
113 RatioPosNegAftrRelCommentsFb 0.0002720 UGC Val T
114 RatioPosNegAftrRelTw 0.0002688 PGC Val T
115 AvgNrOfLikesFb 0.0002577 PGC Vol U
116 PctAftrRelRepliesTw 0.0002494 UGC Vol T
117 PctNegBfrRelFb 0.0002460 PGC Val T
118 NrPosAftrRelRepliesTw 0.0002411 UGC Val T
119 PctAfterReleaseFb 0.0002385 PGC Vol T
120 NrPosBfrRelTw 0.0002378 PGC Val T
121 NrNeg1WeekBfrRelCommentsFb 0.0002359 UGC Val T
122 Pct1WeekBfrReleaseFb 0.0002346 PGC Vol T
123 NrOfPostsFb 0.0002314 PGC Val U
124 AvgSent2WeeksAftrRelFb 0.0002233 PGC Val T
125 PctPosAftrRelCommentsFb 0.0002231 UGC Val T
126 AvgNrOfPostsPerDay1WeekBfrRelPostsFb 0.0002145 PGC Vol T
127 ChangeSentPrePostRelTw 0.0002139 PGC Val T
128 NrNeutralAftrRelRepliesTw 0.0002087 UGC Val T
129 PctNeg2WeeksAftrRelFb 0.0002084 PGC Val T
130 RatioPosNegAftrRelRepliesTw 0.0002058 UGC Val T
131 PctNeutralFb 0.0002044 PGC Val U
132 NrPos2WeeksAftrRelTw 0.0002041 PGC Val T
133 NrPos2WeeksAftrRelRepliesTw 0.0001891 UGC Val T
134 PctNegTw 0.0001875 PGC Val U
135 PctNeutralRepliesTw 0.0001860 UGC Val U
136 PctNeutralBfrRelCommentsFb 0.0001856 UGC Val T
137 AvgSentScoreRepliesTw 0.0001751 UGC Val U
138 PctNegBrfRelRepliesTw 0.0001728 UGC Val T
139 Nr2WeeksAftrRelRepliesTw 0.0001686 UGC Vol T
140 NrPos2WeeksAftrRelFb 0.0001643 PGC Val T
141 AvgSentAftrRelRepliesTw 0.0001643 UGC Val T
142 NrNegTw 0.0001634 PGC Val U
143 NrNegAftrRelFb 0.0001633 PGC Val T
144 NrNeutral2WeeksAftrRelTw 0.0001586 PGC Val T
145 NrPos1WeekBfrRelTw 0.0001575 PGC Val T
146 NrPosFb 0.0001569 PGC Val U
147 PctAfterReleaseTw 0.0001563 UGC Vol T
148 NrPosAftrRelTw 0.0001559 PGC Val T
149 NrAfterReleaseTw 0.0001534 PGC Vol T
150 PctPosBfrRelCommentsFb 0.0001510 UGC Val T
151 Pct2WeeksAftrRelCommentsFb 0.0001499 UGC Vol T
152 NrAFterReleaseFb 0.0001469 PGC Vol T


(Table 18 – Part 4)

Nr. Variable name MeanIncreaseRMSE PPI/PGC/UGC Vol/Val U/T
153 NrPosBfrRelCommentsFb 0.0001446 UGC Val T
154 RatioPosNegBfrRelFb 0.0001308 PGC Val T
155 Nr1weekBfrRelTw 0.0001295 PGC Vol T
156 AvgLengthCommentFb 0.0001270 UGC Vol U
157 Pct1WeekBfrRelCommentsFb 0.0001256 UGC Vol T
158 NrNeutralAftrRelTw 0.0001244 PGC Val T
159 NrOfLikesTw 0.0001241 PPI - -
160 AvgNrOfRepliesPerDay2WeeksAftrRelRepliesTw 0.0001227 UGC Vol T
161 AvgNrOfTweetsPerDayAftrRelTweetsTw 0.0001214 PGC Vol T
162 NrPosAftrRelFb 0.0001209 PGC Val T
163 RatioPosNegAftrRelFb 0.0001183 PGC Val T
164 AvgNrOfRepliesPerDayBfrRelRepliesTw 0.0001181 UGC Vol T
165 NrPos1WeekBfrRelRepliesTw 0.0000954 UGC Val T
166 AvgNrOfTweetsPerDay1WeekBfrRelTweetsTw 0.0000917 PGC Vol T
167 NrNeutral1WeekBfrRelFb 0.0000889 PGC Val T
168 NrOfTweets 0.0000872 PGC Vol U
169 PctPos2WeeksAftrRelFb 0.0000865 PGC Val T
170 PctChangeSentPrePostRelFb 0.0000836 PGC Val T
171 NrBfrRelRepliesTw 0.0000807 UGC Vol T
172 NrAftrRelRepliesTw 0.0000762 UGC Vol T
173 NrNeutralAftrRelFb 0.0000756 PGC Val T
174 AvgNrOfTweetsPerDay2WeeksAftrRelTweetsTw 0.0000676 PGC Vol T
175 NrNeutralTw 0.0000636 PGC Val U
176 PctNegAftrRelCommentsFb 0.0000609 UGC Val T
177 AvgNrOfRepliesPerDayAftrRelRepliesTw 0.0000585 UGC Vol T
178 PctNeutralAftrRelFb 0.0000574 PGC Val T
179 NrofRepliesTw 0.0000557 PGC Vol U
180 NrPosTw 0.0000555 PGC Val U
181 PctNegAftrRelTw 0.0000513 PGC Val T
182 NrNegAftrRelRepliesTw 0.0000474 UGC Val T
183 Nr2WeekAftrRelTw 0.0000472 PGC Vol T
184 NrNeutralRepliesTw 0.0000462 UGC Val U
185 NrNegAftrRelTw 0.0000461 PGC Val T
186 NrNeutralBfrRelRepliesTw 0.0000403 UGC Val T
187 NrPosBfrRelRepliesTw 0.0000368 UGC Val T
188 PctNeutralAftrRelCommentsFb 0.0000324 UGC Val T
189 Nr1WeekBfrRelRepliesTw 0.0000258 UGC Vol T
190 NrNeutral2WeeksAftrRelRepliesTw 0.0000244 UGC Val T
191 NrNeutral1WeekBfrRelRepliesTw 0.0000217 UGC Val T
192 RatioPosNegFb 0.0000207 PGC Val U
193 PctNegFb 0.0000164 PGC Val U
194 NrNegBfrRelRepliesTw 0.0000142 UGC Val T
195 PctNegAftrRelRepliesTw 0.0000132 UGC Val T
196 NrAftrRelCommentsFb 0.0000000 UGC Vol T
197 NrNeg2WeeksAftrRelRepliesTw -0.0000004 UGC Val T
198 NrNeg2WeeksAftrRelTw -0.0000036 PGC Val T
199 PctChangeSentPrePostRelTw -0.0000098 PGC Val T
200 NrNegFb -0.0000152 PGC Val U
201 AvgNrOfTweetsPerDayTweetsTw -0.0000167 PGC Vol U
202 NrNeg1WeekBfrRelRepliesTw -0.0000256 UGC Val T
203 NrNegRepliesTw -0.0000315 UGC Val U
204 AvgNrOfPostsPerDay2WeeksAftrRelPostsFb -0.0000521 PGC Vol T
205 AvgNrOfRepliesPerDay1WeekBfrRelRepliesTw -0.0000527 UGC Vol T
206 AvgSentBfrRelFb -0.0000571 PGC Val T
207 PctNegRepliesTw -0.0002121 UGC Val U
208 AvgNrOfDaysBetweenCommentsFb -0.0046478 UGC Vol U
209 PctAftrRelCommentsFb -0.0087265 UGC Vol T
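The MeanIncreaseRMSE scores in Table 18 are permutation importances: a feature's values are shuffled, and the resulting increase in the model's RMSE measures how much the model relied on that feature. The sketch below illustrates the idea in a minimal, hypothetical setting; it uses synthetic data and a simple linear model as a stand-in for the random forest, so it is an illustration of the computation rather than the thesis's actual procedure (which averages the increase over the out-of-bag samples of each tree).

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: only feature 0 actually drives the response.
n = 500
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=n)

# Simple linear model as a stand-in for the fitted random forest.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda X: X @ coef

def rmse(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

baseline = rmse(predict(X), y)

# Permutation importance: shuffle one feature at a time and record
# how much the RMSE increases relative to the baseline.
mean_increase_rmse = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-response link
    mean_increase_rmse.append(rmse(predict(Xp), y) - baseline)
```

Here `mean_increase_rmse[0]` comes out large while the scores of the two uninformative features stay near zero, mirroring how the near-zero (and slightly negative, due to sampling noise) scores at the bottom of Table 18 indicate variables the model does not rely on.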


Attachment 4 Partial dependence plots FbTwAll

Figure 4: Partial dependence plot of the number of likes

Figure 5: Partial dependence plot of the number of negative comments before the release


Figure 6: Partial dependence plot of the number of comments two weeks after the release

Figure 7: Partial dependence plot of the hype factor of the comments


Figure 8: Partial dependence plot of the number of neutral comments two weeks after the release

Figure 9: Partial dependence plot of the average sentiment score of the comments


Figure 10: Partial dependence plot of the average number of times the tweets were favorited

Figure 11: Partial dependence plot of the average number of times the replies were retweeted


Figure 12: Partial dependence plot of the average length of the posts

Figure 13: Partial dependence plot of the number of positive comments after the release

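Each partial dependence plot above shows the model's average predicted sales when one variable is clamped to a grid of values and all other variables keep their observed values. The following minimal sketch shows the computation on made-up data with a simple quadratic model standing in for the random forest; the data, model, and helper `partial_dependence` are illustrative assumptions, not the thesis's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the response depends nonlinearly on feature 0 only.
n = 400
X = rng.uniform(0.0, 1.0, size=(n, 2))
y = 3.0 * X[:, 0] ** 2 + rng.normal(scale=0.05, size=n)

# Quadratic-in-feature-0 model as a stand-in for the random forest.
design = lambda X: np.column_stack([X[:, 0], X[:, 0] ** 2, X[:, 1]])
coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
predict = lambda X: design(X) @ coef

def partial_dependence(X, feature, grid):
    """Average model prediction with `feature` clamped to each grid value."""
    curve = []
    for v in grid:
        Xc = X.copy()
        Xc[:, feature] = v  # fix the feature, keep all other columns as observed
        curve.append(predict(Xc).mean())
    return np.array(curve)

grid = np.linspace(0.0, 1.0, 21)
pd_curve = partial_dependence(X, 0, grid)
```

Plotting `grid` against `pd_curve` then yields a curve of the same kind as Figures 4 through 13: here an increasing curve, because higher values of feature 0 raise the average prediction.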