The value of pre-launch volumes in predicting initial sales of new cars

MSC Business Studies Marketing Thesis

by

Jorrit Stein 10618015

First supervisor: Ms. E. Korkmaz

Second supervisor: Dr. Umut Konus

Final version January 30, 2015


Statement of originality

This document is written by student Jorrit Stein, who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Index

Abstract
Introduction
Literature review
Predicting car sales
Advertisement
Web data based forecasting
Twitter
Forecasting with Twitter data
Conceptual framework and hypotheses
Car model tweets
Brand tweets about car
Mass media
Twitter data
Car sales data
Sample
Variables
Results
Graphical description
Hypothesis testing
Discussion and conclusions
Managerial implications
Further research
References
Appendix 1: Search queries Twitter
Appendix 2: Correlation plots
Appendix 3: Key findings literature review


Abstract

Recent technological developments, the rise of big data and the increased use of the internet have led to a new era in the prediction of consumer behavior. Organizations and scholars increasingly see the value of collecting consumer information and using this data to make predictions about future consumer behavior. This study examines whether predictions can be made with online word of mouth (WOM) on Twitter. Specifically, this research tries to find a relation between pre-launch Twitter volumes and the initial sales of new car models. The two Twitter volumes used in this research are the consumer and brand tweets about a new car model. Furthermore, we examine the relationship between brand media expenses and both pre-launch WOM on Twitter and post-launch initial car sales. Data was collected for seventeen new car models introduced in the Netherlands in 2013 and 2014. Descriptive, correlation and regression analyses were applied to test the relationships. The results show no strong support for the ability to predict the initial sales of new cars with pre-launch Twitter volumes generated by either consumers or brands, although our outcomes do indicate that pre-launch WOM on Twitter seems to have a positive influence on the initial sales of new cars. Furthermore, we did not find a relationship between pre-launch brand expenses and initial sales. We did find a strong positive relationship between post-launch brand investment in mass media promotion of a new car and its initial sales. The thesis results indicate that post-launch media expenses by a brand are a better predictor of initial car sales than the pre-launch consumer and brand tweet volumes and the pre-launch media expenses.


Introduction

Nowadays, companies are collecting more and more consumer data. This development is driven by the widespread digitalization of the world. The storage of big data, combined with technological development, new processing techniques and the increased usage of the internet and social network sites, has led to a new era in the prediction of consumer behavior (Goldman Sachs Group, 2014).

This digital development provides new opportunities, which have attracted a lot of attention from scholars during the last decades, but there is still plenty of room for new research.

Business management teams increasingly see the value of collecting consumer information and using this data to make predictions about future consumer behavior. Consumer data is increasingly seen as a resource that can lead to a sustainable competitive advantage (The Economist Intelligence Unit, 2013). It is becoming such a resource because companies are collecting an increasing amount of diverse, exclusive and unique consumer information, which is used to improve business practices. Consumer data, and in particular the results of its analysis, is a valuable resource for making strategic decisions at the business level, which can result in the prolonged existence of a company (Barney, 1991).

Recently, scholars have become increasingly interested in research topics concerning predictive consumer analytics, due to the potential value of consumer data. As mentioned before, the success of social media is an important pillar for the rise of consumer data. The various social media platforms offer consumers the opportunity to generate and spread information to a large audience in an unprecedented manner. This online generated word of mouth (WOM) appears to be a promising source of information for the prediction of consumer behavior, as WOM has already proven to be very valuable in the offline world. Recent studies on the diffusion of innovations have found that the volume of WOM correlates significantly with consumer activity and market outcome (Anderson, 2003; Neelamegham and Chintagunta, 1999). WOM constitutes the basis of interpersonal communication, which has an important influence on product evaluations and purchase decisions by consumers. According to Grewal, Cline and Davies (2003), WOM is more powerful than business communication, because consumers consider WOM more credible and valuable. The power of WOM is also frightening for marketers, as informal discussions among consumers are difficult to control and can make a product either popular or unsuccessful in the market.

As mentioned earlier, recent studies on the diffusion of innovations have found significant correlations between the volume of WOM and related market results. This research on the diffusion of innovations is based on the new product diffusion theory developed by Bass (2004) and Rogers (2004). Bass (2004) suggests that innovators in the early stage of a product life cycle are mainly affected by mass media. After using the new products, the innovators pass their opinions on to latecomers via WOM channels. Rogers (2004) recognizes WOM as a channel of communication in the product life cycle, particularly among the early majority and late majority, who tend to base their purchase decisions on the WOM from the early adopters.

Digital developments have led to a new variety of WOM, namely online WOM. According to Phelps et al. (2004), online WOM is even more influential than offline WOM because of its speed, convenience, wide reach and the absence of interpersonal pressure. Furthermore, online marketers are nowadays able to archive WOM interactions from online forums in databases. This offers organizations the opportunity to estimate their marketing effects directly and perhaps more accurately. In the past decade, researchers have carried out an increasing number of studies to understand the power of online WOM. They found a significant effect of online consumer reviews on product sales (Chen and Xie, 2008; Dellarocas, 2003; Li and Hitt, 2008; Miller, Fabian and Lin, 2009; Cui, Lui and Guo, 2012).


Online WOM is supported by various digital platforms. Social networks on the internet, in particular, are a major source of online WOM. Cui, Lui and Guo (2012) and Asur and Huberman (2010) identify Twitter as a platform that is increasingly suitable for studying the effect of online WOM on the adoption of new products by consumers. Twitter is a very popular microblogging website where users can follow people of interest, update their own status with tweets, retweet messages of others or communicate with them directly. Since Twitter launched in March 2006, the service has rapidly gained worldwide popularity and its user base has grown exponentially, reaching 500 million registered users in 2012, who posted 340 million tweets and performed 1.6 billion search queries per day. In 2013 Twitter was one of the most visited websites (Wiki, 2014). Following this success, the platform has drawn more and more attention from researchers in various disciplines.

Today, scholars are highly interested in exploiting online WOM data in their forecasting problems. Although evidence has been found that online WOM influences consumer purchases, there are still theoretical and empirical questions about the effect of online WOM on new product sales. This study contributes to the academic literature for the following reasons. Firstly, despite much practical research by marketing agencies into the opportunities of the Twitter database, there is a major shortage of academic research on its predictive capabilities. Secondly, as mentioned by Cui, Lui and Guo (2012), most studies on online WOM to date have dealt with forecasting the sales of experience products, such as books, movies and television shows. These products are often well promoted prior to their release and attract customer reviews within a short period after their public release. Although a few studies also included more technological products (Clemons, Goa and Hitt, 2006; Mudambi and Schuff, 2010), there is a shortage of studies focusing solely on the effects of online WOM on technical products. Thirdly, most of the existing studies focus on forecasting the sales of existing products and not of new products.


Finally, most research has examined the relationship between online WOM and sales after the product launch. A positive relationship between post-launch online WOM and sales has been found in the literature (Chen and Xie, 2008; Dellarocas, 2003; Li and Hitt, 2008; Miller, Fabian and Lin, 2009; Cui, Lui and Guo, 2012), and the new product diffusion theory of Bass (2004) and Rogers (2004) suggests that word of mouth plays a greater role in the growth period than in the introduction stage. Nevertheless, recent studies indicate that online WOM can affect product sales early in the product life cycle (Amblee and Bui, 2008; Dellarocas, 2003; Dellarocas, Zhang and Awad, 2007; Asur and Huberman, 2010). The results of these studies indicate that online WOM might even have an effect before the start of the product life cycle, yet evidence on the relationship between pre-launch online WOM and sales is lacking. To sum up, our research contributes to the literature because we focus on predicting the sales of a technological product using pre-launch online WOM on Twitter.

This thesis investigates the opportunities of online WOM on Twitter to predict sales in a business environment. Specifically, this research tries to find a relationship between pre-release Twitter data and the initial sales of new car models. The research is thus intended to test the forecasting capabilities of Twitter in a real world situation. This study focuses on the car industry for two reasons. Firstly, the topic of cars is of considerable interest among the Twitter community. Tweets about the car industry accounted for roughly 1.5% of total tweets in the Netherlands in 2013. Secondly, the real world outcomes of car sales can be easily observed due to the new car registration system in the Netherlands. The next chapter presents a literature study, which starts with the definitions of the key concepts of the research topic. The literature review continues by highlighting some of the key articles on the topics of word of mouth, forecasting with social media and predicting car sales. After the literature review we present a conceptual framework and the hypotheses. In the subsequent results section we present the outcomes of a descriptive, correlation and regression analysis. This research report ends with the discussion, managerial implications and suggestions for further research.

This thesis contributes to both the scientific and the practical world. Our results support or reject the possibility of predicting real world outcomes with tweets. Marketers or managers can use the results of this research as a valid argument for decision making on a strategic level. For instance, a manager can use the findings of our research as a basis for instructions to the marketing department. A manager who is interested in knowing the future market share of a product can assign the department to analyze the Twitter messages about the company's own products and those of its competitors. If necessary, the department can initiate a volume-creating campaign on Twitter. Moreover, for online marketing agencies the results can confirm the value of their activities. They can scientifically explain to their clients the importance of being active on social media and monitoring their competition. As mentioned before, this study contributes to the scientific world because we focus on predicting the sales of a technological product using pre-launch online WOM on Twitter. Most importantly, a positive result would mean another confirmation of the predictive power of Twitter.


Literature review

This literature section starts with the definitions of the key concepts of the research topic. It continues by highlighting some of the key articles related to the research questions. Our research touches on multiple research fields, namely car sales prediction, the effect of advertisement, word of mouth, web data based forecasting, Twitter and, in particular, predicting with Twitter. The key findings of this literature review are summarized in Appendix 3: Table 1.

Definitions

Two important concepts of this study are word of mouth and Twitter volume. According to Dichter (1966), word of mouth in marketing can be defined as the passing on of information between a non-commercial communicator and a receiver concerning a brand, a product or a service. A non-commercial communicator is someone who is not rewarded for passing on the information.

Twitter volume is the number of tweets about a certain topic. Practitioners often refer to Twitter volume as Twitter buzz. According to Thomas (2004), buzz is the interaction of consumers and users of a product or service which intensifies or changes the original marketing message of a brand. Buzz can be an emotion, energy, excitement or anticipation about a product or service and can be either positive or negative. Originally, buzz referred to oral communication, but the emergence of Twitter has changed this. Social networks are nowadays the dominant communication channels for marketing buzz. The source of the buzz can be intentional marketing activities, or it can be the result of independent events that reach the larger public through traditional or social media.
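The definition of Twitter volume given above can be illustrated with a minimal sketch. It assumes that tweets are available as (date, text) pairs and that a car model is identified by a list of search keywords. The function name, the keyword "examplecar", the launch date and the sample tweets are all invented for illustration; they are not the thesis data, and they are not the search queries listed in Appendix 1.

```python
from datetime import date


def tweet_volume(tweets, keywords, start, end):
    """Count tweets posted in [start, end] whose text mentions any keyword.

    Matching is a simple case-insensitive substring test, which is an
    assumption of this sketch rather than a documented query syntax.
    """
    lowered = [keyword.lower() for keyword in keywords]
    count = 0
    for posted_on, text in tweets:
        in_window = start <= posted_on <= end
        mentions_topic = any(keyword in text.lower() for keyword in lowered)
        if in_window and mentions_topic:
            count += 1
    return count


# Invented example tweets (hypothetical, not thesis data).
sample_tweets = [
    (date(2014, 1, 5), "Can't wait for the new ExampleCar hatchback!"),
    (date(2014, 1, 9), "Traffic is terrible today"),
    (date(2014, 2, 1), "Saw the examplecar at the auto show, looks great"),
    (date(2014, 3, 20), "Test drove the ExampleCar yesterday"),
]

# Pre-launch window for a hypothetical launch on 1 March 2014.
pre_launch_volume = tweet_volume(
    sample_tweets, ["examplecar"], date(2014, 1, 1), date(2014, 2, 28)
)
print(pre_launch_volume)
```

Only the two matching tweets that fall inside the window are counted; the tweet of 20 March mentions the model but is posted after the assumed launch date.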

Predicting car sales

Some challenges of forecasting a really new product are described by Urban, Weinberg and Hauser (1996). The really new product in this article is an electric vehicle by General Motors. The authors describe how the automobile manufacturer combined a new measurement methodology, called information acceleration, with existing marketing research methods to forecast potential sales. The basic idea behind the information acceleration method is to place consumers in a virtual buying environment that simulates the information available to them at the time they make a purchase decision. This multimedia virtual buying environment conditions respondents for future situations, simulates user experience and encourages consumers to actively search for information on the product. The method used to generate the forecast for the electric vehicle combines measurements of factors that are believed to influence consumer buying choice. In the information acceleration measurement, Urban, Weinberg and Hauser (1996) use showroom visits, advertising, magazine articles and word of mouth as sources of information that consumers access in their search for information about a new car. Depending on their accessibility and quality, these information types can influence consumer choice and, subsequently, potential sales. In addition, the authors point to other influencing factors such as governmental regulation, offerings of other brands, environmental situations, driving experience, reviews, and technological and infrastructure development.

Newman and Staelin (1972) studied the pre-purchase information seeking of consumers looking for a new car or major household appliance. They argue that knowledge of consumer information seeking is fundamental to understanding buyer behavior and to planning marketing communications and retail distribution. The article tries to identify the main influences on information seeking. The study examined 653 households that had bought a new car or a major household appliance. Newman and Staelin applied two multivariate techniques with an information seeking index as the dependent variable. The index is constructed simply by identifying the sources of information used by the different households. The identified sources are categorized as friends or neighbors; books, pamphlets, magazines or articles; newspaper or magazine advertisements; television commercials; and other sources, such as repairmen or mechanics. Newman and Staelin find support for their hypothesis that purchase and use of a product result in learning which later influences buying behavior. Another interesting finding is that half of the buyers thought mainly of only one brand at the outset of the decision process. Moreover, the results indicate that many buyers engage in little information seeking, even though enough information is accessible, suggesting a significant selectivity of search. However, the authors state that this does not mean the buyer is badly informed: the buyer may have started with what he regarded as sufficient knowledge. Also, counts of types of sources and types of information say little about the quality and quantity of information seeking. The result that many buyers engaged in little information seeking is consistent with a finding reported earlier by Newman and Staelin (1971), according to which half of the buyers of new cars and major appliances had purchase decision times of one or two weeks. In this earlier research the amount of information seeking was positively related to decision time, but the data also showed that experienced buyers were able to collect a substantial amount of information in a short time.

Bennett and Mandell (1969) studied pre-purchase information seeking behavior using repeat purchase data for new car purchasers. They found that experience alone, measured by the number of times the choice decision has been faced, does not appear to affect information seeking behavior. In contrast, positively reinforced past choices, measured in aggregate or in sequence, decrease the amount of pre-purchase information seeking in which consumers engage. This study supports the contention that brand choice behavior is a form of human behavior subject to learning through reinforcement. Consumers who are loyal to a brand are either more susceptible to the brand’s marketing or harder to reach for competing brands. Bennett and Mandell identify multiple sources of information in the new car purchase decision process: consumer reports, dealer visits, expert opinion, friends' opinions, reading brochures, discussion with spouse, auto shows, advertisements, news articles and discussion with children.


Koppel, Charlton and Fildes (2006) focused on the reasons why people buy cars. Specifically, they examined the importance of vehicle safety in the new vehicle purchase process for fleet vehicles. They found that safety is generally not the primary consideration in the vehicle purchase process and that it is outranked by factors such as price and reliability. The full list of factors identified in the vehicle purchase process by Koppel, Charlton and Fildes (2006) is warranty, type, price, style, safety, running costs, re-sale, reputation, reliability, price, performance, model, fuel, country and comfort.

Advertisement

The most critical question in the field of advertising is how much of the marketing budget to spend and where to spend it. An important aspect of this question is how long advertising can affect sales. Clarke (1976) surveyed the published econometric literature to examine the duration of the effect of advertising on sales. Clarke found that the studies yield conflicting estimates of the duration interval. The examination of the published articles leads to the conclusion that 90% of the cumulative effect of advertising on sales of mature, frequently purchased, low priced products occurs within three to nine months of the advertisement. These results show strong support for the hypothesis that the effect of advertising on sales lasts for months rather than years.
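The notion of a 90% duration interval can be made concrete with a small sketch. It assumes that the effect of an advertisement decays geometrically, so that a fixed fraction of the effect (the retention rate) carries over from one period to the next. This decay form, the function names and the example retention rate of 0.5 per month are assumptions made for illustration; they are not taken from the thesis and are not a reproduction of Clarke's estimates.

```python
import math


def cumulative_share(retention, periods):
    """Share of the total advertising effect realized after `periods` periods.

    With geometric decay the effect in period t is proportional to
    retention ** t, so the realized share is 1 - retention ** periods.
    """
    return 1 - retention ** periods


def duration_interval(retention, share=0.90):
    """Number of periods needed before `share` of the cumulative effect has occurred."""
    if not 0 < retention < 1:
        raise ValueError("retention must lie strictly between 0 and 1")
    if not 0 < share < 1:
        raise ValueError("share must lie strictly between 0 and 1")
    # Solve 1 - retention ** n = share for n.
    return math.log(1 - share) / math.log(retention)


# Hypothetical monthly retention rate of 0.5 (illustrative value only).
months = duration_interval(0.5)
print(round(months, 2))
```

Under these assumptions a higher retention rate gives a longer duration interval, which is one way to see why different estimated carry-over rates lead to conflicting duration estimates.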

Heyse and Wei (1985) build upon the finding of Clarke (1976) of a relationship between sales and present and past advertisement. In their article they explain that the lagged effects of advertising on sales may result from a delayed response to the marketing effort. Moreover, they posit that advertisement can lead to new customers transferring from the competition and to an increase in demand from existing customers. In addition, advertising may have a cumulative effect on total demand. Heyse and Wei also build on a possible link between present advertisement and past sales, which was recognized and discussed in earlier literature such as Schmalensee (1972). According to Heyse and Wei, an important reason for this link to exist is that advertising budgets are often set as a percentage of sales. Heyse and Wei (1985) construct a joint time series model to understand the dynamic relationship between sales and advertising and to improve forecasting. The joint multiple time series model shows that sales and future advertising were related mainly at a lag of one period: a positive relationship existed between sales and advertising one period later. The relationship between advertising and future sales beyond the current period was found to be weak.
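The idea that sales can lead advertising by one period can be illustrated with a lagged correlation. The sketch below is a plain cross-correlation and not the joint multiple time series model of Heyse and Wei. The two series are invented: advertising is constructed as 10% of the previous period's sales, mimicking a budget that is set as a percentage of sales, and the first advertising value is arbitrary.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leading, following, lag):
    """Correlate leading[t] with following[t + lag] for a non-negative lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(leading, following)
    return pearson(leading[:-lag], following[lag:])


# Invented series (hypothetical, not thesis data).
sales = [100, 120, 90, 150, 130, 170, 110, 160]
# Advertising in period t is 10% of sales in period t - 1; first value arbitrary.
advertising = [10.5] + [0.1 * s for s in sales[:-1]]

same_period = lagged_correlation(sales, advertising, 0)
one_period_later = lagged_correlation(sales, advertising, 1)
print(round(same_period, 2), round(one_period_later, 2))
```

Because advertising is built directly from lagged sales, the correlation at a lag of one period is essentially perfect by construction, while the same-period correlation is much weaker. Real data would of course show a noisier pattern.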

Web data based forecasting

In the following paragraphs we discuss literature that covers the broader topic of forecasting with web data. This literature shows the various possibilities for predicting real world outcomes based on internet datasets. The field of prediction with internet data contains many different streams, owing to the various origins and types of data and to the variety of research disciplines involved. For instance, web based prediction is used in psychology, health science and social science. Digital platforms provide a diverse, rich and virtually unlimited set of data, which offers media companies the unprecedented ability to track and model the behavior of individual users over time. Researchers use the data to help firms better understand consumers and improve management decision making. In the long term this will improve business practices (Feit et al., 2013).

Goel et al. (2013) explored how web search can predict collective future behavior days or even weeks in advance. Goel and Goldstein (2014) used connectivity data generated by social media. Across different sectors, they found that social data are informative in identifying individuals who are most likely to undertake certain activities. Scholars are also examining the opportunities of big data forecasting on a larger scale. Ettredge, Gerdes and Karuga (2005) showed that web search statistics can predict macroeconomic variables, such as the unemployment rate.

Besides these business-related topics, scholars have examined the use of predictive analytics in the health sciences. Cooper et al. (2005) found a correlation between Yahoo! search activity associated with specific cancers and their estimated incidence, estimated mortality and volume of related news coverage. Polgreen et al. (2014) studied the application of internet searches to influenza surveillance. The authors' models predicted an increase in communities positive for influenza one to three weeks before it occurred.

Besides the business environment and healthcare, web data based forecasting is also used by scholars researching the prediction of stock markets and politics. Antweiler and Frank (2004) found, using automated linguistic methods, a correlation between activity on internet forums and stock volatility and trading volume. Gilbert and Karahalios (2010) and Choudhury et al. (2010) used online posts to predict stock market behavior. Williams and Gulati (2008) found, using a multivariate analysis, that the number of Facebook supporters is a valid predictor of electoral success. Veronis (2007) shows that a simple count of candidate mentions in the press can be a better predictor of electoral success than commonly used election polls.

Word of mouth

Dichter (1966) examined the relationship between successful everyday WOM recommendations and effective advertising. He discovered that the two concepts are closely related. According to Dichter, this emphasizes the new role of the advertiser as that of a friend who recommends a tried and trusted product. In addition, Dichter identifies WOM as an influence that complements mass media advertising. There is a symbiotic relationship between the impersonal and the personal, or the formal and informal, avenues of communication. This relationship is moderated by the risk involved in buying a new product. For instance, when a consumer considers buying a new car, the economic risk is much higher than when buying toilet paper. Dichter demonstrates that if the consumer risks are high, WOM recommendations are among the strongest influences on consumers' purchase decisions. Dichter further points out that there is a market of influencers who can be reached and influenced by advertising in existing specialized publications, such as professional magazines, or by another appropriate approach. The remaining consumers are influenced by these influencers through WOM.

Cui, Lui and Guo (2012) examined the effect of online reviews on product sales for consumer electronics and video games. To research this effect the authors analyzed panel data on 332 new products from .com over nine months. Cui, Lui and Guo identified and measured three characteristics of consumer product reviews: volume, valence and dispersion. The reason for measuring the volume of product reviews is that discussions about a product in online forums lead to increased awareness among consumers. The valence is the average rating or the fraction of positive and negative opinions. Cui, Lui and Guo found that each of the metrics of online reviews significantly affects consumer purchases, and that together these metrics have a tremendous effect on new product sales. The effects tend to be stronger or weaker depending on the product category. In other words, they found important contextual variables that moderate the influence of online reviews. Specifically, the authors make a distinction between experience products and search products.

Search products are goods that consumers can evaluate by specific attributes before purchase, such as the technical or performance aspects of a product. Consumers assessing a search product are more likely to use a systematic decision making process. On the internet there is a tremendous amount of information available on product attributes, functions and performance. Even more importantly, the evaluation of products by other consumers is prominently displayed, and consumers can easily access such information. Cui, Lui and Guo (2012) found that the valence of reviews has a great effect on evaluations and purchase decisions for search products. This indicates a strong persuasive effect of product ratings for more complex products and for consumers experiencing a high level of involvement. Moreover, the researchers found that the effect of the volume of page views by readers is significant for both experience and search products, but that the volume of page views has a greater influence than the volume of reviews only for search products. According to Cui, Lui and Guo, the latter suggests a significant role for followers or latecomers in this product category.

Experience products require feeling or experiencing. They are difficult to describe using specific attributes and may induce different experiences across consumers. Evaluations of experience products by consumers tend to be very personal and less indicative of the quality of a product. Moreover, in the online environment consumers cannot directly feel the products or experience product attributes. Consequently, consumers considering an experience product rely more on extrinsic affective cues, such as the popularity of the product. Cui, Lui and Guo (2012) confirm this: they found that experience products are more subject to the influence of the volume of reviews. A large volume of reviews signals the popularity of a product and creates an awareness effect.

Furthermore, the authors studied the effect of online product reviews over time. They found that the volume of reviews has a significant positive effect on new product sales in the early period of a product life cycle. This effect decreases over time, which according to Cui, Lui and Guo suggests the significant role played by early reviews. They also found that the percentage of negative reviews has a greater effect than that of positive reviews, confirming the negativity bias.

Recently, scholars have put a lot of effort into researching the importance of online WOM for the success of movies. These studies rely on metrics such as the number of messages, votes and discussion pages dedicated to a film. Liu (2006) used online WOM data from the website Yahoo! Movies to study the dynamics of the online discussion. Liu (2006) found that valence is important during pre-launch, but that once the movie has launched, online WOM volume becomes by far the best predictor of sales. The author concludes that online WOM is a good predictor of success, but that the link is not causal: WOM mainly reflects the media exposure of the film. Asur and Huberman (2010) found a similar result using Twitter data. They show that the number of tweets created around a movie can predict box office revenues. We discuss this article more thoroughly later in this literature review. Holbrook and Addis (2007) show that online WOM increases with a film's budget and that WOM positively impacts the revenues of the film. Online WOM thus serves as a proxy for media exposure.
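Volume-based prediction of this kind can be sketched as a simple least-squares fit of an outcome on a WOM volume. The code below is a generic one-predictor regression written for illustration. It is not the specification used in any of the cited studies, and the volumes and sales figures are invented rather than taken from the thesis data.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented data (hypothetical): pre-launch tweet volume and initial sales per product.
tweet_volumes = [200, 450, 800, 1200, 1500]
initial_sales = [50, 95, 170, 250, 310]

intercept, slope = fit_simple_ols(tweet_volumes, initial_sales)
fit_quality = r_squared(tweet_volumes, initial_sales, intercept, slope)

# Predicted outcome for a new product with a hypothetical volume of 1000 tweets.
predicted = intercept + slope * 1000
print(round(slope, 3), round(fit_quality, 3), round(predicted, 1))
```

A good fit on such data shows association only. As noted above for movies, WOM volume may largely reflect media exposure, so a strong fit does not by itself establish a causal effect of WOM on sales.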

Twitter

Recently, Twitter has attracted scholarly interest from various fields and for different reasons. Jansen et al. (2009) researched the usefulness of Twitter for marketers and brand owners. They found that the analysis of Twitter chatter can be used to monitor digital word of mouth in the area of product marketing. Jansen et al. (2009) also found that one fifth of a random sample of tweets contained mentions of a product or brand, and that an automated monitoring tool was able to detect significant differences in users' attitudes towards a brand.

Another stream of research on Twitter focuses on understanding its usage and community structure (Honeycutt and Herring, 2009; Huberman, Romero and Wu, 2008; Java et al., 2007), which provides a general understanding of why and how people use the microblogging service. Briefly, these studies found that the intentions and intensity of Twitter usage differ considerably. Huberman et al. (2009) analyzed the social interaction on Twitter and found that the driver of Twitter usage is a limited hidden network among friends and followers. The scholars conclude that most of the interaction links are meaningless.

Asur and Huberman (2010) describe Twitter as an extremely popular online micro blogging service with a very large user base, consisting of several millions of users. According to the scholars, tweets normally consist of personal information about the users, news or links to content such as images, video and articles. A retweet is a post originally made by one user that is forwarded by another user. Retweets are a way of disseminating interesting posts and links. According to Asur and Huberman, Twitter has attracted lots of attention from organizations because of the huge potential it provides for viral marketing. Organizations are using Twitter to advertise products and spread information to stakeholders. Due to its huge audience, Twitter is even increasingly used by news organizations to filter out the latest news updates.

Forecasting with Twitter data

Next to the general understanding of Twitter, other researchers took an interest in its predictive power and potential application to other areas. This paragraph discusses a range of applications of the predictive power of Twitter. Using auto-regression models, Achrekar et al. (2011) found that the volume of flu related tweets is highly correlated with the number of fever cases reported. Lampos and Cristianini (2010) also suggest the possibility of using Twitter to track the spread of epidemic diseases. Tumasjan et al. (2010) analyzed Twitter messages mentioning parties and politicians prior to the 2009 German federal election and found that Twitter is indeed used as a platform for political debate. The number of tweets concerning a political party or person reflects voter preferences and comes close to traditional election polls.

Next to the varied uses of the predictive power of Twitter mentioned above, we are interested in the application of Twitter data in the business environment. According to Asur and Huberman, social media can also be considered a form of collective wisdom. They decided to investigate the power of Twitter in predicting real world outcomes. Asur and Huberman constructed a linear regression model for predicting box office revenues of movies in advance of their release. The model uses the rate of chatter extracted from a total of almost three million tweets. The model's predictions were more accurate than those of the Hollywood Stock Exchange. They found a strong correlation between the amount of pre-launch attention a movie receives and its ranking in the future. Moreover, they analyzed the sentiments present in tweets and demonstrated their efficacy at improving predictions after a movie has been released. Asur and Huberman conclude by arguing that the method used can be extended to a large variety of topics, ranging from the future rating of products to agenda setting and election outcomes. Moreover, they state that their work shows how social media expresses a collective wisdom which, when properly tapped, can yield an extremely powerful and accurate indicator of future outcomes.

Zhang, Fuehres and Gloor (2011) published a paper in which they describe early work trying to predict financial market movements such as gold price, crude oil price, currency and stock market indicators by analyzing Twitter posts. They collected Twitter feeds for five months, capturing a large set of emotional retweets originating from within the USA. They extracted six public opinion time series containing the keywords “dollar”, “$”, “gold”, “oil”, “job” and “economy”. They found a Granger-causal relationship between the keywords, except for “$”, and certain market movements. Their results show that these keywords are correlated with and predictive of financial market movement. The study concludes that emotional Twitter outbursts on a topic on one day, measured as the volume of retweeting on economic topics, are a fairly accurate predictor of how the corresponding stock market will be doing the next day.

The previous studies provide reasons to believe that Twitter offers a database suitable for analysis and subsequent prediction of sales, crime, flu trends, revenues, stock markets and more. In a business context, Twitter data analysis can offer a lot of statistics about how a brand is performing on the internet.

Conceptual framework and hypotheses

Urban, Weinberg and Hauser (1996), Dichter (1966), Bennett and Mandell (1969) and Newman and Staelin (1972) have identified WOM as an important source of information for consumers in the decision making process. Therefore, WOM can be used as an important source for new car sales prediction. Newman and Staelin (1971) discovered that the information seeking process of consumers deciding on a new car seems to be short and includes only a limited number of sources. Moreover, they found that half of the buyers think mainly of only one brand at the outset of the decision process.

The following literature identified WOM as a key source in the aforementioned consumer decision making process. Dichter (1966) defined WOM, in marketing, as the passing on of information between a non-commercial communicator and a receiver concerning a brand, a product or a service.

Recent studies on the diffusion of innovations have found that the volume of WOM correlates significantly with consumer activity and market outcome (Anderson, 2003; Neelamegham and Chintagunta, 1999). Grewal, Cline and Davies (2013) found information from WOM to be more powerful than printed information, because WOM is considered more credible and valuable. They argue that informal discussions among consumers can influence the popularity of a product, particularly for new products. So theoretically, there is support for the effect of WOM on new product purchase decisions and sales.

Cui, Lui and Guo (2012) consider online WOM even more influential than offline WOM because of its speed, convenience, wide reach, and the lack of interpersonal pressure. Furthermore, online marketing managers today can archive WOM interactions from online forums in databases. These databases lead to opportunities to estimate the effects of consumer WOM directly and perhaps more accurately.

To better understand the effect of online WOM on new product purchase decisions and sales, we take a closer look at the diffusion of innovation literature developed by Bass (2004) and Rogers (2004). The diffusion of innovation literature deals with the adoption of innovation in societies at the aggregate product category level (Anderson, 2003; Neelamegham and Chintagunta, 1999). But the diffusion of innovations theory has also been found very useful in analyzing the role of online WOM in new product growth (Cui, Lui and Guo, 2012). A key concept in the literature of the diffusion of innovations is the Bass model. This model by Bass (2004) proposes that the early adopters and innovators in the early stage of the product life cycle are affected by external influences, such as mass media.

According to Bass (2004), these external influences are the reason why innovators are the main adopters of new products in the first period of the product life cycle. In the later growth and maturity periods of the product life cycle, the adoption of new products accelerates due to internal influences, such as WOM. These influences result in the adoption of the product by followers and latecomers. Thus, the diffusion of innovation theory considers early adopters as the initial main driving force of the dispersion process. In this diffusion process, WOM has been recognized as a key channel of communication, particularly among the early majority and late majority, who tend to follow the innovators and early adopters (Bass, 2004; Rogers, 2004).
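The distinction between external and internal influences described above corresponds to the two coefficients in the standard hazard-rate formulation of the Bass model. The notation below (F, f, p, q) is not used in this thesis and is introduced here only as an illustration.

```latex
% F(t): cumulative fraction of eventual adopters who have adopted by time t
% f(t) = dF(t)/dt: adoption rate at time t
% p: coefficient of innovation (external influence, e.g. mass media)
% q: coefficient of imitation (internal influence, e.g. WOM)
\[
  \frac{f(t)}{1 - F(t)} = p + q\,F(t)
\]
```

Early in the life cycle F(t) is close to zero, so adoption is driven almost entirely by p. As F(t) grows, the term q F(t) dominates, which is the accelerating WOM effect described in the text.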

The diffusion of innovations literature indicates that WOM plays an increasingly important role in new product adoption during the growth stage of the product life cycle. The diffusion of innovation literature is supported by Dichter (1966). This scholar argues that there exists a ready-made market of influencers who can be reached and, in turn, influenced by advertising. The rest of the consumers are influenced by these influencers through WOM. By extending the diffusion theory to the online world, one would expect online WOM to exert minimal effect in the early introduction stage of a new product, but greater effect in the growth period of the new product.

However, recent studies suggest a change in the role of WOM in the product life cycle due to the rise of the internet. For instance, Phelps et al. (2004) argue that the speed, convenience and large reach of digital interactions are changing the dynamics of the various industries in which WOM has traditionally played an important role, especially for new product launches. In the current digital world, WOM about products and brands can reach consumers instantaneously. Consumers no longer have to wait for known influencers, friends or family to give them interpersonal advice about a product.

Other research has looked for more practical evidence of a change in the role of WOM, and its findings suggest that online WOM has an early impact on new product sales. Cui, Lui and Guo (2012) support previous findings by Amblee and Bui (2008), Dellarocas (2003) and Dellarocas, Zhang and Awad (2007) that online WOM can substantially increase the initial sales of new products, exaggerate product growth and cause the reversal of sales growth when maturity sets in. Cui, Lui and Guo (2012) exposed this influence by researching the early effect of online WOM on new product sales. Multiple other studies confirm this early effect of online WOM on product sales. These studies examined the effect of early online WOM on the revenues of books, movies and video games (Holbrook and Addis, 2007; Chen and Xie, 2008; Dellarocas, 2003; Li and Hitt, 2008; Miller, Fabian and Lin, 2009; Liu, 2006). The aforementioned studies suggest that online WOM communication is an important source of information for consumers planning to purchase new products. User generated content, such as tweets, helps customers make informed decisions about purchasing new products. Online WOM has thus become an important driver and predictor of new product sales.

The key findings of studies such as Cui, Lui and Guo (2012) contradict the earlier findings of Rogers (2004) and Bass (2004). These new findings propose an early effect of online WOM on the product life cycle. This early effect causes new products to experience tremendous growth in the early stage of the product life cycle as a result of online WOM. On the other hand, these growth effects do not extend into the next phase of the product life cycle and tend to fade out over time. From a managerial perspective, these findings suggest that online WOM shortens product life cycles and forces organizations to rethink their pre-launch and post-launch marketing strategies.

The early effect of online WOM has also been confirmed by Asur and Huberman (2010). According to Asur and Huberman (2010), social media expresses a collective wisdom which, when properly used, can yield an extremely powerful and accurate indicator of future outcomes. Therefore, Asur and Huberman (2010) studied an even earlier effect of online WOM. They researched the effect of online WOM, specifically Twitter volumes, in the pre-launch period of a product. They found a positive correlation between the volume of pre-launch online WOM and the box-office revenues of a particular movie in its release weekend. These findings suggest an even earlier effect of online WOM than that found by Cui, Lui and Guo (2012). This thesis continues building on the theory that online WOM can affect, and thus predict, product sales even before the official launch of the product.

Car model tweets

In this thesis we examine the change in the role of online WOM in the product life cycle and the predictive abilities of online WOM in the pre-launch phase of a new product. To continue building on the findings of Asur and Huberman (2010), we use the pre-launch online WOM on the social network Twitter to predict new product sales. Twitter has been identified by Asur and Huberman (2010) as one of the fastest growing online WOM networks. The micro-blogging network has experienced a burst of popularity in recent years, leading to a huge user base consisting of several tens of millions of users who actively participate in the creation and propagation of content.

According to Asur and Huberman (2010), Twitter has attracted lots of attention from organizations because of the huge potential it provides for viral marketing. Organizations are using Twitter to advertise products and spread information to stakeholders. Jansen et al. (2009) found that one fifth of a random sample of tweets contained mentions of a product or brand.

The pre-launch Twitter data is used to predict new car sales. This thesis uses this product category because of the contribution of Cui, Lui and Guo (2012). They studied the effect of various metrics of online WOM on new product sales. They found that volume, valence and views of online reviews all significantly affect new product sales, but the effects tend to be stronger or weaker depending on the product category. In general, product type influences the search behavior of consumers and their use of information sources, which in turn influences their choices. As mentioned, this research focuses on forecasting car sales of new models. In this thesis we position new cars in the search category, although cars are hard to categorize explicitly as either a search or an experience product because certain car characteristics, such as the driving experience, are very personal and hard to describe.

Cui, Lui and Guo (2012) showed that online WOM can be measured by different metrics, such as volume or sentiment. In our research we use tweet volume as the metric. This is mainly because tweet volume has demonstrated its predictive capabilities in previous research. Asur and Huberman (2010) found a relationship between tweet volumes and box-office revenues. Zhang, Fuehres and Gloor (2011) argue that the more positive tweets there are about a financial market, the higher the chance of a financial upturn. Liu (2006) shows that online WOM about a film concentrates on the weeks before and after the release day, and declines steadily thereafter. Liu used data retrieved from the website Yahoo! Movies. The author concludes that online WOM volume is a good predictor of success.

Achrekar et al. (2011) conclude that the volume of flu related tweets is highly correlated with the number of fever cases reported. The research of Tumasjan et al. (2010) indicates that the number of tweets concerning a political party or person reflects voter preferences and comes close to traditional election polls. And as mentioned earlier, Cui, Lui and Guo (2012) also found a positive relationship between the volume of online product reviews and product sales. The rationale behind this effect of the volume of product reviews is that discussions about a product in online forums lead to increased awareness among consumers. Although we acknowledge that not all of the above mentioned studies are about predicting new product sales, the findings do show that online WOM volume is a good predictor of real world outcomes.

To briefly summarize, prior research on car sales prediction indicates that WOM is an important influencer in the consumer decision process (Urban, Weinberg and Hauser, 1996; Newman and Staelin, 1972). Furthermore, previous literature shows that online WOM affects early sales of new products. The information seeking of consumers deciding on a new car seems to be short and to include only a limited number of sources. In addition, Twitter has grown to be the online platform for WOM about particular brands. More importantly, Twitter volumes have proven to be a good predictor of future consumer behavior in various fields. This thesis is interested in the relationship between tweet volumes and the initial sales of new cars.

H1: New cars that are more discussed on Twitter during the pre-launch phase sell better in the post-launch period.

Brand tweets about car

Besides the total tweet volume about a particular topic, we are also interested in how online WOM is actively stimulated by the brands. According to Thomas (2004), Twitter volumes arise from the interaction of consumers and users of a product or service, which intensifies or changes the original marketing message of a brand. These original marketing messages are distributed through the marketing channels of the brand in the pre-launch phase. One of these channels is the brand’s Twitter account. Asur and Huberman (2010) examined whether movies that have greater self initiated publicity, in terms of tweets with linked URLs, perform better at the box office. They found the correlation of URLs and retweets with box-office performance to be moderately positive. However, they also found that these features are not very predictive of the relative performance of movies. Just like Asur and Huberman (2010), this study is interested in how attention and popularity are generated for products by the various brands, and in the effects of this attention on the real world performance of the products. Preliminary exploratory research on the pre-launch period of new cars on Twitter showed similarities with the findings of Asur and Huberman (2010). Prior to the release of a new car, brands generate promotional information in the form of tweets with trailer videos, news, and photos. Brand tweets prior to the release of products consist primarily of such promotional campaigns, tailored to promote information distribution via online WOM on a large scale. Due to the promotional character of these brand tweets, they are expected to have a large positive influence on online WOM and eventually on the initial sales of new car models. Following the research of Asur and Huberman (2010), this thesis examines the relationship between the volume of such promotional tweets published by the brand about a new car model and the initial sales.

H2: Brands that initiate higher publicity in the pre-launch phase through Twitter have higher initial new car sales.

Mass media

Another way through which marketing messages are distributed to stimulate online WOM and sales for new products is via advertisement. As already mentioned, the diffusion of innovation literature discussed by Bass (2004) suggests that in the early stage of a product life cycle, innovators are mainly affected by mass media and, after using the new products, pass their opinions to latecomers via WOM channels. As discussed earlier, the recent study by Asur and Huberman (2010) suggests an earlier presence and influence of online WOM. Therefore, this thesis examines the relationship between pre-market advertisement and pre-launch online WOM.

H3: Higher brand investment in pre-market mass media promotion for a new car leads to more pre-market Twitter volume.

In addition, Urban, Weinberg and Hauser (1996), Bennett and Mandell (1969) and Newman and Staelin (1972) identified advertising as an important source of information for consumer decision making. The relationship between advertisement and sales has been examined by Clarke (1976) and Heyse and Wei (1985). Clarke (1976) found that the positive effect of advertisements on sales occurs within three to nine months. Heyse and Wei (1985) found similar results and posit that sales and advertising are more strongly related as time periods overlap. They specifically found a strong connection between advertising budgets and current sales. The advertising budgets are often set as a percentage of sales. These two articles suggest a short-term effect of advertisement on online WOM and sales. Therefore, this thesis is interested in the relationship between the pre-launch mass media investment and the post-launch initial car sales. Since Clarke (1976) and Heyse and Wei (1985) found a stronger relationship between sales and advertisement in the same time period, we also examine the relationship between the post-launch media investment and the post-launch new car sales. The results for these relationships can serve as reference material for the relationship between pre-launch Twitter volumes and initial car sales, similar to how Asur and Huberman (2010) compared their model built from tweet volumes to market-based predictors to indicate its predictive strength.

H4a: Higher brand investment in pre-market mass media promotion leads to higher initial new car sales.

H4b: Higher brand investment in post-market mass media promotion leads to higher initial new car sales.

When analyzing the influence of tweet volumes on sales, two influencing factors have to be taken into account. The first factor is the popularity of a brand. Some car brands might be more popular in the Netherlands, which can strongly influence the sales of a particular brand. One major reason for popularity is loyalty. Newman and Staelin (1972) found that the purchase and use of a product result in learning which later influences buying behavior. The second factor is the fluctuation in car sales throughout the year. Dutch consumers are known to often buy a car just before the summer holidays, when they receive their extra holiday allowance. Other fluctuations emerge due to changes in legislation, for example in the area of emission taxes. These fluctuations can influence the measurements of initial sales of new cars.

Method

To answer the research question of whether Twitter volumes in the pre-launch phase forecast the initial sales of new cars, a database analysis is conducted. We used various databases to gather information about Twitter volumes, media expenditures and car sales.

Twitter data

An online monitoring tool called Buzzcapture is used to gather the Twitter volume data. Buzzcapture is a web based tool which gives an organized display of all Dutch Twitter data provided by the Twitter API. The tool helps to gain insight into a specific topic discussed on Twitter. The required datasets for this research are the brand and car model tweet volumes. This information is gathered using search queries within the tool. As an example, for the model search of the Volkswagen Golf, we use the full written name of the car model as a search query. This means a tweet has to contain the words Volkswagen and Golf; otherwise the message is not included in the data. For the brand search, we use the full written name of the brand (Appendix 1: Table A1: Search queries Twitter). These search queries work as a filter in the program. The data gathered from Buzzcapture gives a detailed overview of the brand and car model tweet volume per month from January 2012 until November 2014.
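The filtering rule described above (a tweet counts only if it contains every word of the full model name) can be illustrated with a small sketch. This is not Buzzcapture's query syntax or API, which the thesis does not document. The function names, the tokenization rule and the sample tweets are all hypothetical.

```python
import re
from collections import Counter


def matches_query(text, required_words):
    """Return True when every required word occurs in the tweet (case-insensitive)."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return all(word.lower() in tokens for word in required_words)


def monthly_volume(tweets, required_words):
    """Count matching tweets per month.

    tweets: iterable of (iso_date, text) pairs, e.g. ("2013-01-14", "...").
    Returns a Counter keyed by "YYYY-MM".
    """
    counts = Counter()
    for iso_date, text in tweets:
        if matches_query(text, required_words):
            counts[iso_date[:7]] += 1
    return counts


# Made-up sample tweets, purely illustrative.
sample = [
    ("2013-01-14", "Net de nieuwe Volkswagen Golf gezien!"),
    ("2013-01-20", "Golf spelen in het weekend"),
    ("2013-02-02", "Proefrit in de volkswagen golf gemaakt"),
]
volumes = monthly_volume(sample, ["Volkswagen", "Golf"])
```

The second sample tweet is dropped because it mentions only "Golf", which shows why requiring the full model name reduces noise from unrelated uses of a word.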

Car sales data

The car sales information is gathered using secondary data from BOVAG. The source of the data is the RDW, the public service provider in the mobility chain (RDW, 2014). When a new car is sold, the new owner is obligated to register the car's license plate in the RDW database. A license plate number is an identifier for vehicles. With the license plate, the RDW can identify who is liable for a vehicle. Furthermore, with the car registration system the RDW can keep track of the sales of new cars. The car sales data from the RDW is processed and edited by the RDC, which is the data centre of BOVAG. BOVAG is a trade organization of more than 10,000 entrepreneurs engaged with mobility (BOVAG, 2014). The data from BOVAG gives a detailed overview of the car sales in the Netherlands per model per month for the period of January 2013 until October 2014 (BOVAG, 2014).

Media expenditure

The information about the media expenditure of the different brands used in this thesis is gathered from the Adfact database. Adfact is a marketing agency which collects, analyzes and sells information concerning the mass media investment of companies in the Netherlands. The database consists of the daily records of all widespread market advertisement. These records include commercials on television and radio, advertisements in cinemas, newspapers and magazines, and outdoor displays. The records are accompanied by the estimated cost of the advertisement based on current market prices and are connected to the related brands and models. The information does not give exact brand media expenditure data but gives a good approximation. Moreover, the expenditure data of the various brands are well comparable with each other. It is important to note that the information does not include expenditure on online display advertising such as banners. The Adfact data gives a detailed insight into the media expenditure of automotive brands in the period of January 2013 to November 2014 (see footnote 1).

Sample

This research uses all Dutch Twitter data provided by the Twitter API. We analyzed a total of 290,308,950 tweets for this research. Buzzcapture gives the opportunity to filter out the necessary information using search queries on a particular topic in a specific time period. Furthermore, this study collects sales data and related brand media expenditure for seventeen new car models introduced in 2013 and 2014 in the Netherlands. In order to provide clean results, we only use information on family cars with a traditional engine. For example, this excludes two seaters and electric cars.

Footnote 1: To check the reliability of the Adfact data, the marketing expenditure information from Adfact was compared with data obtained from one of the brands in our research. The results show that the two sources align.

Another requirement is that a new car needs to have been sold more than a hundred times in 2013 or 2014. This study uses these new cars because they generate enough Twitter volume and demand for media investment from the related brand to examine the proposed relationship between Twitter volume, media expenditure and car sales. We use data from the Netherlands because Twitter is a commonly used social media platform in the Netherlands (Azevedo, 2011) and because of the high rate of car ownership that comes with the country's high prosperity (Worldbank, 2011; Legatum institute, 2014). This makes the Netherlands a good country for researching the predictive capabilities of Twitter. Finally, and most importantly, the necessary data is available. The Twitter volumes, media expenditures and car sales data are gathered for seventeen new cars (Table 1: New car models introduced in 2013 and 2014).

Table 1: New car models introduced in 2013 and 2014

Number  Model                        Available in the Netherlands
1       Renault Capture              April 2013
2       Peugeot 2008                 May 2013
3       Opel Adam                    January 2013
4       Kia Carens                   March 2013
5       Peugeot 108                  June 2014
6       Volkswagen Golf Sportsvan    May 2014
7       Citroen C4 Cactus            June 2014
8       BMW 3 Serie Gran Turismo     June 2013
9       Mercedes-Benz CLA-klasse     March 2013
10      Fiat 500L                    January 2013
11      Renault Zoe                  March 2013
12      Seat Toledo                  March 2013
13      BMW i3                       November 2013
14      Opel Cascada                 April 2013
15      Mini Paceman                 March 2013
16      Mercedes-Benz GLA-klasse     March 2014
17      Porsche Macan                April 2014

The analysis starts with a descriptive graphical analysis of the Twitter volumes, media expenditures and car sales data. The remainder of the analysis in this thesis resembles the study of Asur and Huberman (2010), in which they investigated the effect of pre-launch Twitter volume on box office revenues for movies. Asur and Huberman (2010) performed a correlation analysis on the tweet-rate a week prior to the release and the box office revenues in the opening weekend. The tweet-rate is defined as the number of tweets referring to a particular movie per hour. Subsequently, they constructed a linear regression model using least squares of the average of all tweets for the 24 movies considered over the week prior to their release. To investigate whether Twitter volumes can predict the early car sales for new car models and whether media expenditure has an influence on the Twitter volumes and initial sales, we make use of various variables in a correlation analysis. Furthermore, we construct multiple linear regression models (Saunders and Lewis, 2011).
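The two analysis steps named above, a correlation analysis followed by a least-squares regression, can be sketched as follows. This is a minimal single-predictor illustration in plain Python, not the thesis's actual models, which are multiple regressions with control variables. The function names are hypothetical and the numbers are made up; they are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ols_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative numbers, NOT data from the thesis.
pre_launch_tweets = [100, 200, 300, 400]
initial_sales = [250, 450, 650, 850]

r = pearson_r(pre_launch_tweets, initial_sales)
slope, intercept = ols_fit(pre_launch_tweets, initial_sales)
```

With only seventeen car models in the sample, a fitted slope of this kind would carry wide uncertainty, which is one reason to report the correlation alongside the regression.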

Variables

The dependent variable is the total number of cars sold in the six months after each model becomes available in the Netherlands. The dependent variable is defined as follows.

Total-CarSales model = Total number of car sales within six months after the release of each new car model

The six-month period was determined after consideration of the various delivery times of new cars and the limitations of the provided car sales data. The monthly new car sales in the BOVAG data are determined by the moment of registration of a new car at the RDW (see footnote 2). This moment of registration aligns with the moment of delivery of a new car to the consumer, but it is not the same as the moment the car is ordered. The delivery time for a new car in the Netherlands can fluctuate strongly, from direct delivery to multiple months, depending on the car model, the brand and the place and time of the order. Also, in some cases the consumer can pre-order the car. There is no reliable data available about the average delivery times of new cars. The six-month period enables this thesis to examine a large percentage of the initial car sales of a new car model.

Footnote 2: The registration of a new car license plate at the RDW is mandatory in the Netherlands. A registration plate is an identifier for vehicles. The register keeps track of who is responsible for a vehicle.

The analyzed sales period is limited by the time period of the car sales data provided by BOVAG, the available Twitter data and the date of the product launch. This research only has access to Twitter data from 2013 and 2014 and can therefore only use car models from these years. To include a large number of new car models in this thesis, the post-launch period cannot be long. This post-launch period strongly limits the number of new cars from 2014 that can be included in this thesis, as we only have car sales data up to November 2014.
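The dependent variable defined above can be computed from monthly registration counts as in the sketch below. It assumes that the six-month window starts with the month in which the model becomes available. The thesis does not state whether the launch month itself is included, so this boundary is an assumption. The helper names and the sales figures are hypothetical.

```python
def add_months(year, month, offset):
    """Shift a (year, month) pair by a number of months; month is 1..12."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def window_total(monthly, start, n_months):
    """Sum monthly values over n_months consecutive months starting at start.

    monthly: dict keyed by (year, month); months without data count as 0.
    """
    return sum(
        monthly.get(add_months(start[0], start[1], k), 0)
        for k in range(n_months)
    )


# Made-up monthly registrations for one model, NOT data from the thesis.
monthly_sales = {
    (2013, 11): 10, (2013, 12): 20, (2014, 1): 30,
    (2014, 2): 40, (2014, 3): 50, (2014, 4): 60,
    (2014, 5): 70,  # outside the six-month window
}
launch = (2013, 11)
total_car_sales = window_total(monthly_sales, launch, 6)
```

Passing `n_months=1` instead of 6 gives the shorter one-month measure used in the additional analysis.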

In this thesis we consider four independent variables. The first variable is the total Twitter volume from consumers about each new car model in the period of six months before the car is available in the Netherlands.

(1) Total-tweets-car model = Total number of tweets for a new car model sent by consumers, accumulated over six months prior to the release of each car model

We use a six-month pre-market period because new car model introduction communications from brands vary over this period. These communications can range from the release of pictures of the new model on social media to billboards along the road. Note that this variable represents the total number of tweets from consumers only, as the thesis subtracts the number of related brand tweets written by the company’s marketing department from the total number of tweets. This subtraction is done because the tweets sent out by the brand are not part of the online WOM from consumers. Note that the thesis still uses the number of company tweets. These tweets are analyzed separately as the third independent variable. The second independent variable is the total media spending on the model by the brand in the period of six months before the car model is available in the Netherlands.

(2) Media-expenditure-pre model = Total amount of media expenditure of each brand for the new model accumulated over six months prior to the release of each car model

Again, we use a period of six months. Similar to the first independent variable, we have chosen this period to capture all the pre-launch brand communications, which are spread over this period. This also applies to the third independent variable. As mentioned before, the third independent variable is the total number of tweets written by the brand about the related car model in the period of six months before the car model is available in the Netherlands.

(3) Total-tweets-brand model = Total number of tweets for a new car model sent by the brand itself, accumulated over six months prior to the release of each car model
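The construction of the three pre-launch variables defined above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the thesis: the monthly dictionaries keyed by (year, month), the function names and the example counts are all hypothetical.

```python
def months_before(release, n=6):
    """Return the n (year, month) pairs immediately preceding the release month."""
    year, month = release
    window = []
    for _ in range(n):
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        window.append((year, month))
    return window


def pre_launch_totals(all_tweets, brand_tweets, media_spend, release, n=6):
    """Accumulate the three pre-launch variables over the n months before release."""
    window = months_before(release, n)
    total_all = sum(all_tweets.get(m, 0) for m in window)
    total_brand = sum(brand_tweets.get(m, 0) for m in window)
    return {
        # Consumer tweets = all tweets about the model minus the brand's own tweets.
        "total_tweets_car": total_all - total_brand,
        "total_tweets_brand": total_brand,
        "media_expenditure_pre": sum(media_spend.get(m, 0) for m in window),
    }


# Hypothetical monthly counts for one car model released in July 2013.
all_tweets = {(2013, m): 40 for m in range(1, 7)}
brand_tweets = {(2013, 5): 3, (2013, 6): 2}
media_spend = {(2013, 6): 15000}
result = pre_launch_totals(all_tweets, brand_tweets, media_spend, release=(2013, 7))
print(result)
```

Months without any recorded tweets or spending are treated as zero here, which is an assumption about how gaps in the source data would be handled.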

In addition to the independent variables we use two control variables. The first control variable is the popularity of the brand, which controls for a brand effect. The popularity of the brand is measured in this thesis by accumulating the total number of car sales of a specific brand in the Netherlands in 2013 and 2014. The higher the sales of the brand in this period, the higher its current popularity. The sales numbers give an indication of consumers’ current willingness to buy the brand. This variable is divided into three categories, namely high, medium and low brand popularity, based on the differences in total brand sales. In this thesis we use two dummy variables to control for the popularity of the brand.

The second control variable is the time period in which a car becomes available. This variable is used to control for the popular periods in which people buy cars in the Netherlands. For instance, the beginning of a new year is often a popular period to buy cars, due to changes in emission regulation at governmental level. If the six-month post-launch sales measurement period overlaps with a popular buying period, higher sales during this period are more likely. We control for this time effect with a dummy variable that indicates a popular time period.
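The dummy coding of the two control variables can be illustrated with a short sketch. The cut-off values for high and low brand popularity are hypothetical, as is the choice to start the post-launch window in the release month; only the idea of two popularity dummies (with medium popularity as the reference category) and one popular-period dummy comes from the text.

```python
def brand_popularity_dummies(brand_sales, low_cutoff, high_cutoff):
    """Return (high, low) dummies; medium popularity is the reference category (0, 0)."""
    high = 1 if brand_sales >= high_cutoff else 0
    low = 1 if brand_sales <= low_cutoff else 0
    return high, low


def popular_period_dummy(release_month, n_months=6, popular_months=(1,)):
    """Return 1 if the n-month window starting in the release month contains a popular month."""
    window = [((release_month - 1 + k) % 12) + 1 for k in range(n_months)]
    return 1 if any(m in popular_months for m in window) else 0


# Hypothetical cut-offs on total brand sales in 2013 and 2014.
LOW_CUTOFF, HIGH_CUTOFF = 5000, 60000
print(brand_popularity_dummies(80000, LOW_CUTOFF, HIGH_CUTOFF))  # high-popularity brand
print(popular_period_dummy(9))  # September release: window runs through February
```

Using two dummies for three categories avoids perfect collinearity with the intercept, which is why the medium category carries no dummy of its own.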

To further investigate the relationship between pre-launch tweets and post-launch sales, we also examine a shorter time period, following the research of Asur and Huberman (2010). Asur and Huberman (2010) found a positive relationship between the Twitter volume one week prior to the release and the box-office revenues. To extend their study to our thesis, we would focus on the tweets one week before the release of a new car model and the car sales directly after the car becomes available. However, our datasets do not provide periods shorter than a month. To keep the time period as short as possible, we examine the relationship between the total number of tweets one month prior to release and the total number of car sales of a new car model one month after the release. The dependent variable is the total number of new cars sold in the month after the car becomes available. The independent variable is the total volume of consumer tweets in the month prior to the release date of the new car.

Sales-month-post model = Total number of car sales within one month after the release of each new car model

Tweets-month-pre model = Total number of tweets for a new car model sent by consumers, one month prior to the release of each car model

The last variable in this thesis is used to examine the relationship between the post-launch media expenditure of each brand and the initial sales, both accumulated over six months after the release of the new car. We have chosen the six-month period to capture all the post-launch brand communications, which are spread over this period. Moreover, our data only provide media expenditure and car sales data for a maximum of six months.


Media-expenditure-post model = Total amount of media expenditure of each brand for the new model, accumulated over six months after the release of each car model.

Multiple linear regression

The seven independent and dependent variables mentioned above are values aggregated over months. The dependent variables are continuous variables. Therefore, to find the predictive value of WOM on Twitter and media expenditure, we construct multiple linear regression models using the variables mentioned above. The regression models are constructed to find a causal relationship between the independent and dependent variables (Vocht, 2007). In addition, we check the coherence of the variables by calculating the correlation coefficients. The correlation coefficients allow us to quantify the strength of the relationships between the continuous variables (Vocht, 2007). The correlation coefficients are also used to check for multicollinearity.

The following regression model is constructed to measure the relationship between pre-launch consumer tweets, brand tweets and media expenditure and the post-launch initial sales, all accumulated over a period of six months.

(1) Total-Sales model = β0 + β1 Total-tweets-car model + β2 Total-tweets-brand model + β3 Media-expenditure-pre model + β4 Control-popular-period + β5 Control-high-brand-popularity + β6 Control-low-brand-popularity + ε
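To make the estimation step concrete, the sketch below fits a linear model of this form by ordinary least squares using only the Python standard library. It is an illustration under assumptions: the function name, the two-predictor toy data and the use of the normal equations are ours, and it says nothing about which software the thesis used for its regressions.

```python
def ols(X, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = []
    for i in range(k):
        row = [sum(r[i] * r[j] for r in rows) for j in range(k)]
        row.append(sum(r[i] * yi for r, yi in zip(rows, y)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data generated from y = 1 + 2*x1 - 0.5*x2 without noise.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1 + 2 * a - 0.5 * b for a, b in X]
coefficients = ols(X, y)
print(coefficients)  # intercept, then one slope per predictor
```

The function fails with a division by zero when predictors are exact linear combinations of each other, which is the extreme case of the multicollinearity problem discussed in the text.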

The next regression model is constructed to measure the relationship between pre-launch consumer tweets and the post-launch initial sales, both over a period of one month.

(2) Sales-month-post model = β0 + β1 Tweets-month-pre model + β2 Control-popular-period + β3 Control-high-brand-popularity + β4 Control-low-brand-popularity + ε


The last regression model is constructed to measure the relationship between the post-launch media investment of each brand and the initial car sales, both accumulated over six months.

(3) Total-Sales model = β0 + β1 Media-expenditure-post model + β2 Control-popular-period + β3 Control-high-brand-popularity + β4 Control-low-brand-popularity + ε

The first two regression models only use Twitter data from before the introduction of the new car model, as the research is intended to develop a model that forecasts car sales in the pre-launch phase of a new car model. Note that independent variables that are highly correlated with each other would cause a multicollinearity problem in a regression model. The thesis checked these correlations, but none of the independent variables showed a strong correlation with another (Table 4).
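A pairwise correlation screen of this kind can be sketched as follows. The 0.90 cut-off mirrors the "very strong" threshold mentioned in the Results section; the function names and the three example predictors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.90):
    """List predictor pairs whose absolute correlation exceeds the threshold."""
    names = list(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(predictors[first], predictors[second])
            if abs(r) > threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged


# Hypothetical predictors: "a" and "b" are nearly proportional, "c" is unrelated.
predictors = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [5, 1, 4, 2, 3],
}
flagged = collinear_pairs(predictors)
print(flagged)
```

A pairwise screen only detects collinearity between two predictors at a time; a predictor that is a combination of several others would need a different check, such as variance inflation factors.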

Endogeneity is not a problem here, as there are no car sales yet.


Results

This section will show the results of preliminary analyses and a graphical description of the most important variables. Moreover it displays the results of the correlation and regression analysis.

Preliminary steps

Before we can test the hypotheses with a regression analysis, the raw data are cleaned and a preliminary data analysis is conducted. We need to clean the data because they come from different sources. The preparation of the data consists of extracting the right periods and values from the complete datasets and calculating the variables of interest. As mentioned before, the time windows for the first regression model are six months before and six months after a new car model becomes available in the Netherlands. The total period therefore sums up to a year per new car model. The second regression model uses a time period of one month before and one month after the product launch. The third regression model uses a time period of six months after the introduction of a new car. The calculations and preparation of the various variables are done in Microsoft Excel 2013.

The preliminary analysis consists of a frequency and normality check of the seven variables of interest. Moreover, a graphical analysis is conducted to give an overview of the relationship between the dependent and independent variables. The frequency and normality check shows the mean of the various variables and indicates that there are no missing values. The data show that Total-sales, Total-tweets-car, Total-tweets-brand, Media-expenditure-pre and Tweets-month-pre are substantially or extremely positively skewed (s > 1) (Vocht, 2007). That means that the most frequent scores are grouped towards the left of the distribution. Moreover, the same five variables show kurtosis (k > 1), a sharp peak of the distribution (Table 2: Frequency and normality check). The results of the normality check show that the thesis has to correct for the skewed data to enable a regression analysis. This correction is done by applying a log transformation to the skewed variables. We have chosen a base-10 log transformation. The results of this log transformation can be found in Table 3.

Table 2: Frequency and normality check

                  Total-   Total-       Total-          Media-             Media-              Tweets-     Sales-
                  sales    tweets-car   tweets-brand    expenditure-pre    expenditure-post    month-pre   month-post
N Valid           17       17           17              17                 17                  17          17
N Missing         0        0            0               0                  0                   0           0
Mean              914      455          5               109638             1555755             98          50
Std. Deviation    1037     519          6               331757             1305771             136         34
Skewness          2.400    2.045        2.423           3.601              0.927               3.097       0.572
Kurtosis          6.837    4.531        7.620           13.433             0.145               11.024      -0.444

Table 3: Log10 transformations

                  Total-sales   Total-tweets-car   Total-tweets-brand   Media-expenditure-pre   Tweets-month-pre
                  Log10         Log10              Log10                Log10                   Log10
Mean              2.75          2.37               0.64                 1.67                    1.63
Std. Deviation    0.45          0.61               0.41                 2.40                    0.70
Skewness          0.145         -0.939             -0.179               0.864                   -0.965
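The skewness check and the log transformation can be illustrated with a small sketch. Two points are assumptions rather than facts from the thesis: the estimator shown is the bias-corrected (adjusted Fisher-Pearson) sample skewness, which may or may not be the one behind Table 2, and the offset of 1 inside the logarithm is one common way to keep zero counts defined. The example counts are hypothetical.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (bias-corrected for small samples)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)


def log10_transform(values, offset=1.0):
    """Return log10(x + offset); the offset (an assumption) keeps zero counts defined."""
    return [math.log10(v + offset) for v in values]


# Hypothetical right-skewed counts with one very large observation.
raw = [3, 5, 8, 12, 20, 45, 400]
skew_raw = sample_skewness(raw)
skew_log = sample_skewness(log10_transform(raw))
print(round(skew_raw, 3), round(skew_log, 3))  # the transformed data are less skewed
```

The transformation compresses large values more than small ones, which pulls in the long right tail and brings the distribution closer to the symmetry that the regression analysis assumes.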

Graphical description

This descriptive analysis graphically demonstrates the relationship between the dependent variable and two of the independent variables, namely consumer tweet volume and brand media expenditure, for the seventeen different new cars (Figure 1: Descriptive graphs for new car models). The brand tweets are not shown in these graphs because their volumes are much lower than those of the other variables. The figures show the same time period for every car model: six months before and six months after introduction, twelve months in total. The grey dashed line indicates the release date of each model. The black line is the model sales and the grey line is the volume of consumer tweets about the related model. The grey bars represent the brand’s media spending on the model, both before and after the release of the car. The graphs show a number of similarities and differences. Every graph shows Twitter volume during the pre-launch period, and eight of the seventeen graphs even show chatter right at the start of the six-month pre-launch measurement period. This indicates that people talk about upcoming models before their release. Moreover, thirteen graphs show a peak in the consumer tweet volume approximately two to four months prior to the introduction. For some models, a decrease in consumer tweets can be observed right after the month of introduction. All graphs, except that of the Porsche Macan, show post-launch media investment by the brand to market the new car model.

There is high heterogeneity in the volumes of tweets, media expenses and sales, in the number of peaks in tweets and sales, and in the starting point of the online WOM about a specific model. A surprising finding is that there is no clear-cut correlation between the volume of tweets and sales after the model becomes available. Moreover, it is very surprising that no brand, except for BMW for the BMW 3 Serie Gran Turismo and Mercedes-Benz for the Mercedes-Benz GLA-Klasse, engages in media spending before a model becomes available in the Netherlands. These graphs reveal a lot about the relationship between the pre-market mass media expenses and both the pre-market Twitter volumes and the post-launch car sales, which are tested in hypotheses H3 and H4a.

Figure 1: Descriptive graphs for new car models

[Figure 1 consists of seventeen panels, one per new car model: Renault Capture, Opel Adam, Kia Carens, Peugeot 108, Volkswagen Golf Sportsvan, Citroën C4 Cactus, BMW 3 serie Gran Turismo, Fiat 500L, Mercedes-Benz CLA-klasse, Renault Zoe, Seat Toledo, BMW i3, Opel Cascada, Mini Paceman, Mercedes-Benz GLA-klasse, Porsche Macan and Peugeot 2008. Each panel plots three monthly series, Media-expenditure model (x10000), Total-CarSales model and Total-tweets-car model, and marks the release date. The horizontal axis shows the twelve months around the release.]

Before we focus on testing the first and second hypotheses, the data behind the two control variables are displayed graphically in Figure 2 (Figure 2: Graphical description total cars sold per brand and total cars sold).


The first plot on the right of Figure 2 motivates our first control variable, which controls for brand popularity. The total number of cars sold per brand varies from more than nine thousand to around a hundred. For this variable we use two dummies, which control for a highly popular brand and a less popular brand respectively. The graph shows that Volkswagen, Peugeot and Renault are the most popular brands in terms of total cars sold in 2013 and 2014 (M=39327, SD=26573). Moreover, the plot indicates that Mini and Porsche are by far the least popular brands in terms of total cars sold in 2013 and 2014.

The second plot on the left of Figure 2 shows a clear fluctuation in car sales from one month to another. For instance, the graph shows strong peaks around January 2013 and January 2014. Both peaks are caused by changes in governmental regulations concerning emission taxes for the new year. We control for this time effect by using a dummy variable which controls for a popular time period. As the sales in both January 2013 and January 2014 appear to be outliers compared to the rest of the months (M = 33519, SD = 5332), we use a dummy variable that indicates an overlap of the introduction period with the month of January in either 2013 or 2014. We call this dummy Popular-period.

Figure 2: Graphical description total cars sold per brand and total cars sold

[Figure 2 consists of two panels. The panel "Total cars sold per brand" shows the total number of cars sold for Kia, Fiat, Seat, Mini, Opel, Mercedes-Benz, BMW, Citroën, Renault, Porsche, Peugeot and Volkswagen. The panel "Total cars sold" plots the total monthly car sales over 2013 and 2014.]

Hypothesis testing

After the preliminary research, we test the hypotheses by calculating the correlation coefficients and by doing a multiple linear regression analysis. The hypotheses H1, H2, H3 and H4a are tested by constructing the first regression model mentioned in the method section.

The correlation matrix only shows significant relationships between the two control dummies for brand effect, High-brand-popularity (r = 0.502, p < 0.05) and Low-brand-popularity (r = -0.485, p < 0.05), and the dependent variable Total-sales. To further explore the correlation between the three independent variables and the dependent variable, the relationships are graphically displayed in three scatter plots (Appendix 2: Figure A1, A2 and A3). Note that the independent variables and the dependent variable are log-transformed. The correlation matrix and scatter plots show a weak positive correlation between Total-tweets-car and Total-sales (r = 0.146, p > 0.05): the higher the pre-launch consumer tweet volume, the higher the car sales. The results also show that Total-tweets-brand and Total-sales are weakly negatively correlated (r = -0.283, p > 0.05): the more tweets by the brand, the lower the sales. Likewise, Media-expenditure-pre and Total-sales show a weak negative correlation (r = -0.255, p > 0.05): the more pre-launch media expenses, the lower the total sales. Especially the latter correlation coefficient seems to be highly influenced by the car models that did not receive any brand investment for promotion through the media. Besides the finding that Media-expenditure-pre and Total-sales have a weak negative correlation, the correlation matrix and Figure A4 (Appendix 2) also show that Media-expenditure-pre has virtually no correlation with Total-tweets-car (r = -0.067, p > 0.05). So, the amount of pre-launch media expenses shows no coherence with the pre-launch WOM on Twitter (Table 4: Correlation matrix regression model one). Furthermore, the correlation coefficients in Table 4 do not show a strong correlation between any of the independent variables. A regression analysis assumes that the independent variables are not linear combinations of each other; in other words, that the matrix of independent variables has full rank (Vocht, 2007). The correlations demonstrate that there is no multicollinearity problem.

Table 4: Correlation matrix regression model one

Variable                              1         2        3        4        5        6        7
1. Total-sales              Log10     -
2. Total-tweets-brand       Log10     -0.283    -
3. Total-tweets-car         Log10     0.146     0.257    -
4. Media-expenditure-pre    Log10     -0.255    0.154    -0.067   -
5. Popular-period           Dummy     0.004     -0.373   0.253    0.179    -
6. High-brand-popularity    Dummy     0.502*    -0.187   0.149    -0.464   -0.236   -
7. Low-brand-popularity     Dummy     -0.485*   0.266    0.032    0.068    -0.133   -0.236   -

*. Correlation is significant at the 0.05 level (2-tailed).
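As a worked check on the significance flags in Table 4, a correlation coefficient can be converted into a t statistic with n - 2 degrees of freedom. The sketch below uses the standard formula and the tabulated two-tailed 5% critical value for 15 degrees of freedom; the function name is ours, and whether the thesis's software tested the correlations in exactly this way is an assumption.

```python
import math

# Two-tailed 5% critical value of Student's t with 15 degrees of freedom (n = 17).
T_CRIT_DF15 = 2.131


def correlation_t(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Correlations with Total-sales as reported in Table 4.
reported = [
    ("High-brand-popularity", 0.502),
    ("Low-brand-popularity", -0.485),
    ("Total-tweets-car", 0.146),
]
for label, r in reported:
    t = correlation_t(r, 17)
    print(label, round(t, 3), abs(t) > T_CRIT_DF15)
```

With seventeen observations a correlation needs an absolute value of roughly 0.48 to reach significance at the 5% level, which is why only the two brand-popularity dummies are flagged.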

In the first model of the first multiple regression, the three independent variables were entered: Total-tweets-car, Total-tweets-brand and Media-expenditure-pre. This model (R = 0.409) was not statistically significant, F (3, 13) = 0.872; p > 0.05. Because just seventeen new cars were introduced during our research period, the total number of observations is small and the number of predictors is relatively large. In this case the adjusted R2 helps with a more accurate interpretation of the results (Vocht, 2007). The small R2 (R2 = 0.168) and negative adjusted R2 (adjusted R2 = -0.025) indicate that the model does not fit the data. After adjustment for the number of predictors, the model explains none of the variance in Total-sales.
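The adjustment can be reproduced from the reported R2 values with the standard formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name below is ours; small differences from the reported figures come from rounding of R2.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k predictors (excluding the intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Reported values for the first regression: n = 17 cars.
model_1 = adjusted_r2(0.168, n=17, k=3)  # about -0.024 (reported: -0.025)
model_2 = adjusted_r2(0.429, n=17, k=6)  # about 0.086 (reported: 0.087)
print(round(model_1, 3), round(model_2, 3))
```

With only seventeen observations each extra predictor costs a large share of the remaining degrees of freedom, so an unadjusted R2 of 0.168 with three predictors corresponds to no explained variance after adjustment.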

In model two, after the addition of the control variables Popular-period, High-brand-popularity and Low-brand-popularity, the total variance explained by the regression model (R = 0.655) was 43% (R2 = 0.429), with F (6, 10) = 1.253; p > 0.05. But, as mentioned before, the adjusted R2 gives a more accurate interpretation. It indicates that the model explains just 9% (adjusted R2 = 0.087) of the variance in Total-sales. Thus, the second model was also not significant and the variance in Total-sales explained by the model is low. The introduction of the dummies Popular-period, High-brand-popularity and Low-brand-popularity explained an additional 26% of the variance in Total-sales (R2 Change = 0.262; F (3, 10) = 1.528; p > 0.05). Although the change is not significant, the explanatory power of the model seems to increase with the dummy variables, and the increase in the adjusted R2 indicates that the dummy variables add explanatory power even when the model is adjusted for degrees of freedom. In the final model (Table 5, Model 2) none of the predictors Total-tweets-brand (β = -0.200, p > 0.05), Total-tweets-car (β = 0.182, p > 0.05) or Media-expenditure-pre (β = -0.023, p > 0.05) is statistically significant. Although not significant, there does seem to be a positive effect of Total-tweets-car on Total-sales. An effect of Media-expenditure-pre on Total-sales does not appear to be present. Total-tweets-brand seems to have a negative influence on Total-sales. The dummy variable High-brand-popularity seems to have a positive effect on Total-sales, while Low-brand-popularity and Popular-period seem to have a negative effect on Total-sales. The latter is rather surprising, as one would expect a car whose introduction period overlaps a popular buying period to sell more during this launch period (Table 5: Results regression model one).

Table 5: Results regression model one

                                   R       R²      Adjusted R²   R² Change   B        SE      β        t
Model 1                            0.409   0.168   -0.025
  Total-tweets-brand       Log10                                             -0.339   0.293   -0.308   -1.158
  Total-tweets-car         Log10                                             0.155    0.192   0.212    0.806
  Media-expenditure-pre    Log10                                             -0.036   0.048   -0.193   -0.749
Model 2                            0.655   0.429   0.087         0.262
  Total-tweets-brand       Log10                                             -0.220   0.349   -0.200   -0.632
  Total-tweets-car         Log10                                             0.133    0.214   0.182    0.621
  Media-expenditure-pre    Log10                                             -0.004   0.051   -0.023   -0.084
  High-Brand-popularity    Dummy                                             0.303    0.292   0.319    1.040
  Low-Brand-popularity     Dummy                                             -0.500   0.343   -0.372   -1.457
  Popular-period           Dummy                                             -0.117   0.433   -0.087   -0.270
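The F statistics reported for this regression follow from the R2 values with the standard formulas for the overall F test and for the F test on an R2 change. The sketch below recomputes them as a consistency check; the function names are ours, and the small deviations from the reported values are due to rounding of R2.

```python
def f_overall(r2, n, k):
    """Overall F statistic of a regression with k predictors and n observations."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic for adding q predictors to reach a full model with k_full predictors."""
    return ((r2_full - r2_reduced) / q) / ((1 - r2_full) / (n - k_full - 1))


# Values from Table 5 (n = 17 cars).
f_model_1 = f_overall(0.168, n=17, k=3)                   # reported: 0.872
f_model_2 = f_overall(0.429, n=17, k=6)                   # reported: 1.253
f_dummies = f_change(0.168, 0.429, n=17, k_full=6, q=3)   # reported: 1.528
print(round(f_model_1, 3), round(f_model_2, 3), round(f_dummies, 3))
```

The denominator degrees of freedom (13 for Model 1, 10 for Model 2) match those given in the text, which confirms that all seventeen cars enter both models.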

To further investigate the relation between pre-launch tweet volumes and post-launch initial sales, and to apply a research method that is more consistent with the method used by Asur and Huberman (2010), we examine the relationship between the two variables using a one-month pre- and post-launch period. In doing so we acknowledge the dataset limitation mentioned earlier in the method section. Hypothesis H1 is further tested by constructing the second regression model mentioned in the method section. In this regression model we also control for the brand and time effects with dummy variables.

The correlation matrix shows no significant relationships (Table 6: Correlation matrix regression model two). So, there is no multicollinearity problem. To further explore the correlation between the independent variable and the dependent variable, the two are graphically displayed in a scatter plot (Appendix 2: Figure A5). Note that the independent variable Tweets-month-pre is log-transformed. The correlation matrix and scatter plot show a weak positive correlation between Tweets-month-pre and Sales-month-post (r = 0.140, p > 0.05): the more tweets in the month before the launch of a new car model, the more sales in the first month after the car becomes available.

Table 6: Correlation matrix regression model two

Variable                              1        2        3        4       5
1. Popular-period           Dummy     -
2. High-brand-popularity    Dummy     -0.236   -
3. Low-brand-popularity     Dummy     -0.133   -0.236   -
4. Tweets-month-pre         Log10     0.169    0.103    0.014    -
5. Sales-month-post                   -0.083   0.297    -0.231   0.140   -

In the first model of the second regression, the independent variable Tweets-month-pre is entered. This model (R = 0.140) was not statistically significant, F (1, 15) = 0.300; p > 0.05. The small R2 (R2 = 0.020) and negative adjusted R2 (adjusted R2 = -0.046) indicate that the model does not fit the data. After adjustment, the model explains none of the variance in Sales-month-post.

In model two, after the addition of the control variables Popular-period, High-brand-popularity and Low-brand-popularity, the total variance explained by the model (R = 0.367) was 14% (R2 = 0.135), with F (4, 12) = 0.468; p > 0.05. Thus, the second model was also not significant. As mentioned before, the adjusted R2 gives a more accurate interpretation. It indicates that the model explains none of the variance (adjusted R2 = -0.154) in Sales-month-post. The introduction of the dummies Popular-period, High-brand-popularity and Low-brand-popularity explained an additional 12% of the variance in Sales-month-post (R2 Change = 0.115; F (3, 12) = 0.533; p > 0.05). If significant, this would indicate an increase in the explanatory power of the model with the dummy variables, but the fact that the adjusted R2 remains negative indicates that the dummy variables have not added explanatory power. In the final model the predictor Tweets-month-pre (β = 0.134, p > 0.05) is not significant. Although the results are not significant and very weak, there seems to be some positive effect of Tweets-month-pre on Sales-month-post (Table 7: Results regression model two).

Table 7: Results regression model two

                                   R       R²      Adjusted R²   R² Change   B         SE       β        t
Model 1                            0.140   0.020   -0.046
  Tweets-month-pre         Log10                                             6.851     12.502   0.140    0.548
Model 2                            0.367   0.135   -0.154        0.115
  Tweets-month-pre         Log10                                             6.531     13.518   0.134    0.483
  Popular-period           Dummy                                             -8.168    29.720   -0.079   -0.275
  High-Brand-popularity    Dummy                                             15.971    21.239   0.219    0.752
  Low-Brand-popularity     Dummy                                             -19.792   29.145   -0.192   -0.679
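The columns of Table 7 are linked by two standard identities: the t value is the unstandardized coefficient divided by its standard error, and the standardized coefficient β equals B multiplied by the ratio of the predictor's standard deviation to that of the dependent variable. The sketch below checks both for Model 1, taking the standard deviations from Tables 2 and 3; the function names are ours.

```python
def t_statistic(b, se):
    """t value of a regression coefficient: estimate divided by its standard error."""
    return b / se


def standardized_beta(b, sd_x, sd_y):
    """Standardized coefficient from the unstandardized one and the two standard deviations."""
    return b * sd_x / sd_y


# Model 1 of Table 7: Tweets-month-pre (Log10) predicting Sales-month-post.
t_value = t_statistic(6.851, 12.502)              # reported: 0.548
beta = standardized_beta(6.851, sd_x=0.70, sd_y=34)  # reported: 0.140
print(round(t_value, 3), round(beta, 3))
```

In a regression with a single predictor the standardized coefficient equals the Pearson correlation, which is why β in Model 1 matches the r = 0.140 reported in Table 6.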

The previous results in the preliminary analysis and the first constructed linear regression model indicate that most brands do not engage substantially in pre-launch investments for promotion through the media channels. To further explore the relationship between brand media expenses and sales, and to enable a comparison between the influence of pre-launch WOM and post-launch brand media investment, we examine the relationship between post-launch brand investment and the initial car sales over a period of six months. Hypothesis H4b is tested by constructing the third regression model mentioned in the method section. In this regression model we also control for the brand and time effects.


As in the first regression model, the correlation matrix shows a significant relationship between the two control dummies for brand effect, High-brand-popularity (r = 0.502, p < 0.05) and Low-brand-popularity (r = -0.485, p < 0.05), and the dependent variable Total-sales. More interestingly, the matrix also indicates a significant correlation between Media-expenditure-post and the dependent variable Total-sales. To further explore this correlation, the variables are graphically displayed in a scatter plot (Appendix 2: Figure A6). Note that the dependent variable is log-transformed. The correlation matrix and scatter plot show a strong positive correlation (Vocht, 2007) between Media-expenditure-post and Total-sales (r = 0.629, p < 0.01): the higher the media expenses by the brand after the launch, the higher the initial car sales. The correlation matrix shows no very strong relationships (r > 0.90) between the predictors. So, there is no multicollinearity problem (Table 8: Correlation matrix regression model three).

Table 8: Correlation matrix regression model three

Variable                              1        2        3         4         5
1. Popular-period           Dummy     -
2. High-brand-popularity    Dummy     -0.236   -
3. Low-brand-popularity     Dummy     -0.133   -0.236   -
4. Total-sales              Log10     0.004    0.502*   -0.485*   -
5. Media-expenditure-post             0.419    0.223    -0.443    0.629**   -

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).

In the first model of the third multiple regression, the independent variable Media-expenditure-post is entered. This regression model (R = 0.629) was statistically significant, F (1, 15) = 9.829; p = 0.007, and explained 40% of the variance in Total-sales (R2 = 0.396). But, as mentioned before, the adjusted R2 gives a more accurate interpretation. It indicates that the model explains 36% (adjusted R2 = 0.356) of the variance in Total-sales. So, 36% of the variation in Total-sales is explained by the post-launch media investments by the brand.


In model two, after the addition of the control variables Popular-period, High-brand-popularity and Low-brand-popularity, the total variance explained by the model (R = 0.767) was 59% (R2 = 0.588), F (4, 12) = 4.290; p = 0.022. This model was also statistically significant, and the adjusted R2 indicates that this second model explains 45% (adjusted R2 = 0.451) of the variance in Total-sales. The introduction of Popular-period, High-brand-popularity and Low-brand-popularity explained an additional 19% of the variance in Total-sales (R2 Change = 0.193; F (3, 12) = 1.872; p > 0.05). Thus, the change was not significant. Nevertheless, the explanatory power of the model seems to increase with the dummy variables, and the increase in the adjusted R2 also shows that the dummy variables have added explanatory power. In Model 2, Media-expenditure-post (β = 0.557, p = 0.036) is significant. The outcome shows that if Media-expenditure-post increases by one euro, the log10 of the car sales of the respective model increases by 1.905E-07 (B = 1.905E-07). Although not significant, the dummy variable High-brand-popularity seems to have a positive effect (B = 0.274, p > 0.05) on Total-sales, while Low-brand-popularity (B = -0.262, p > 0.05) and Popular-period (B = -0.251, p > 0.05) seem to have a negative effect on Total-sales (Table 9: Results regression model three).

Table 9: Results regression model three

                                   R       R²      Adjusted R²   R² Change   B           SE      β        t
Model 1                            0.629   0.396   0.356
  Media-expenditure-post                                                     2.153E-07   0.000   0.629    3.135
Model 2                            0.767   0.588   0.451         0.193
  Media-expenditure-post                                                     1.905E-07   0.000   0.557    2.363
  Popular-period           Dummy                                             -0.251      0.296   -0.187   -0.850
  High-Brand-popularity    Dummy                                             0.274       0.197   0.288    1.392
  Low-Brand-popularity     Dummy                                             -0.262      0.281   -0.195   -0.932
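Because the dependent variable Total-sales is log10-transformed while Media-expenditure-post enters in euros, the coefficient B is easier to read as a multiplicative effect on sales. The sketch below converts the Model 2 coefficient under that assumption about the units; the function name and the example spending increment are ours.

```python
def sales_multiplier(b, extra_spend):
    """Multiplicative change in sales implied by a log10-linear model: 10 ** (b * extra_spend)."""
    return 10 ** (b * extra_spend)


B_POST = 1.905e-07  # Model 2 coefficient of Media-expenditure-post (Table 9)

per_euro = sales_multiplier(B_POST, 1)             # barely above 1
per_million = sales_multiplier(B_POST, 1_000_000)  # 10 ** 0.1905, about 1.55
print(per_euro, round(per_million, 2))
```

Read this way, an additional one million euros of post-launch media spending is associated with roughly 55% higher six-month sales, holding the control dummies constant. This is an association in seventeen observations, not an estimate of what extra spending would cause.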

Considering the results from hypothesis testing in total, we can infer the conclusions in Table 10.


Table 10: Results hypothesis testing

Hypothesis                                                                                                                          Result

H1. New cars that are more discussed on Twitter during the pre-launch phase, sell better in the post-launch period.                 Rejected

H2. Brands that initiate higher publicity in the pre-launch phase through Twitter have higher initial new car sales.                Rejected

H3. Higher brand investment in pre-market mass media promotion for a new car leads to more pre-market Twitter volume.               Rejected

H4a. Higher brand investment in pre-market mass media promotion leads to higher initial new car sales.                              Rejected

H4b. Higher brand investment in post-market mass media promotion leads to higher initial new car sales.                             Supported

Discussion and conclusions

The results of this thesis show no strong support for the ability to predict the initial sales of new cars with pre-launch Twitter volumes. We did not find a significant relationship between the volume of consumer tweets about a new model and its initial sales after it becomes available. Nor did we find a statistically significant relationship between the publicity a brand initiates on Twitter in the pre-launch phase and the initial sales of its new car model. Moreover, we did not find a relationship between the amount of money a brand invests in pre-market mass media promotion for a new car and the online WOM on Twitter. Furthermore, the results reject a relationship between the amount of money a brand invests in pre-market mass media promotion for a new car and the initial sales of the new car directly after it becomes available.

Based on our results we cannot assume the same role of online WOM on Twitter in the pre-launch phase of a new product as suggested by Asur and Huberman (2010). Despite the similarities in research design, our study does not support the earlier findings of Asur and Huberman (2010), who showed a clear relationship between Twitter volumes and new product sales and emphasized the possibilities of predicting real-world outcomes with social data. Our outcomes indicate that online WOM plays no role earlier than after the launch, which is the role assumed by Holbrook and Addis (2007), Chen and Xie (2008), Dellarocas (2003), Li and Hitt (2008), Miller, Fabian and Lin (2009) and Lui (2006). We did not research the relationship between post-launch online WOM on Twitter and the initial sales of new cars after a car becomes available, as we intended to find a pre-launch forecasting method. Moreover, post-launch measurements would lead to endogeneity problems, whereby tweet volumes and sales mutually influence each other.

One explanation for the lack of support for the study of Asur and Huberman (2010) is that we examined a product from a different category. We researched a technical product that consumers can evaluate by specific attributes before purchase. Asur and Huberman (2010) investigated the box-office revenues of movies, a product that requires feeling or experiencing and is more difficult to describe using specific attributes. In addition, experience products can evoke different experiences across consumers (Cui, Lui and Guo, 2012). According to Cui, Lui and Guo (2012), experience products are often well promoted prior to their release and attract customer reviews within a short period after their public release. Moreover, the product category influences the message and the sender of the tweet. Furthermore, the product category in our research is intended for a different audience than the product studied by Asur and Huberman (2010).

The absence of a relationship between consumer tweet volumes and initial car sales gives support to some critical comments made by Cui, Lui and Guo (2012). They discuss social contagion literature which suggests that people’s decision to engage in a new behavior, such as buying a car, does not depend on WOM but on the choices of other people. Hence, the demand for products evolves partly as a function of interpersonal communication, and social learning processes shape demand for new products. The more people use a product, the more widely information about it spreads. So the main influencers are not the early adopters but a critical mass of easily influenced latecomers. This suggestion questions the dominant role of early adopters in the diffusion of new products and highlights the influence of latecomers in the diffusion process. Cui, Lui and Guo (2012) also highlight the difficulty of identifying and targeting the early adopters. Given the lack of significant relationships between pre-launch brand media expenses and pre-launch online WOM on Twitter, it also becomes relevant to ask whether the assumption that pre-launch WOM is caused by the early adopters is correct.

In the theory section, we identified multiple sources that influence consumer decision making. Although WOM has been recognized as one of the main influencers, the literature also highlights advertisement as an important influencer (Urban, Weinberg and Hauser, 1996; Bennett and Mandell, 1969; Newman and Staelin, 1972). We did find a significant and strong positive relationship between post-market brand investment in mass media promotion of a new car and the initial new car sales. The results indicate that post-launch media expenses are capable of forecasting the initial sales of new car models. Our research shows that the post-launch media expenses of a particular brand are a much better predictor of initial car sales than the pre-launch consumer and brand tweet volumes and the pre-launch media expenses. Note that this relationship was found between two variables measured over the same time period. This result strongly supports the findings of Clarke (1976) and Heyse and Wei (1985), who found a strong relationship between sales and advertisement in the same time period. In addition to this significant relationship, our results suggest that pre-launch online WOM on Twitter has a positive association with the initial sales of new cars. Although this association is neither significant nor strong, it does provide an opportunity for further research.

The absence of significant relationships between pre-launch consumer tweet volume and initial car sales can be explained in different ways. As mentioned before, previous research identified multiple sources of information in the consumer decision making process. WOM has been recognized as an important influencer by Urban, Weinberg and Hauser (1996), Bennett and Mandell (1969) and Newman and Staelin (1972). But it is important to note that these articles also indicate that WOM is just one of many sources of information in the purchase decision process for a new car. Other sources mentioned in these articles are reports, showroom visits, expert opinion, friends’ opinions, brochures, discussion with spouse, auto shows, advertisement, new articles and discussion with children. The influence of WOM is heavily reduced by these other sources. It is also important to note that we researched online WOM on Twitter, which is just a portion of total WOM. On top of that, not every WOM communication is relevant as a source of information for the consumer purchase decision. The literature section indicates that key features in the vehicle purchase process are warranty, type, price, style, safety, running costs, re-sale, reputation, reliability, performance, model, fuel, country and comfort (Koppel, Charlton and Fildes, 2006; Urban, Weinberg and Hauser, 1996). The messages on Twitter about a particular new car might not have contained such information, which would reduce the effect of online WOM on initial car sales.

The absence of significant relationships between pre-launch consumer tweet volume and initial car sales can also be explained by some outliers in the dataset. Figure A1 (Appendix 2) shows that some models are outliers. One example is the Porsche Macan, which was relatively widely discussed in the pre-launch phase but which few people can afford to buy. These outliers have a big impact on the analysis because the car models suitable for this research totaled just seventeen.
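The sensitivity of a small-sample correlation to a single outlier can be illustrated with a leave-one-out check. The sketch below is only an illustration under stated assumptions: the numbers are hypothetical and are not the thesis data, and the function names `pearson` and `leave_one_out` are introduced here for the example. It computes the Pearson correlation on log10 values, as in the plots of Appendix 2, once with all observations and once with each observation dropped in turn.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def leave_one_out(xs, ys):
    """Correlation obtained when each observation is dropped in turn."""
    results = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        results.append(pearson(rest_x, rest_y))
    return results


# Hypothetical (tweets, sales) pairs, NOT the thesis data.
# The last pair mimics a heavily discussed but rarely bought model.
tweets = [10, 20, 30, 40, 50, 60, 500]
sales = [100, 180, 330, 390, 520, 600, 50]

log_tweets = [math.log10(t) for t in tweets]
log_sales = [math.log10(s) for s in sales]

r_all = pearson(log_tweets, log_sales)        # all observations
r_loo = leave_one_out(log_tweets, log_sales)  # one observation dropped each time

print(f"all observations: r = {r_all:.2f}")
for i, r in enumerate(r_loo):
    print(f"without observation {i}: r = {r:.2f}")
```

In this made-up example the correlation is negative with the outlier included and strongly positive without it, which is the kind of instability a sample of seventeen models is exposed to.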

In this study we did not find a significant relationship between pre-market brand tweets and initial car sales. This is mainly due to the absence of a substantial volume of brand tweets in the pre-launch phase of a new car. The absence of substantial pre-market publicity initiated by the brand corresponds with the findings of Asur and Huberman (2010), who examined the pre-release promotional information on Twitter generated by movie and media companies and producers. The absence of significant relationships between brand media expenses and both pre-launch online WOM and post-launch initial car sales can be explained by examining the pre-launch media expenses of each brand. As shown in Figures A3 and A4 (Appendix 2), just six out of seventeen brands appear to invest in mass media promotion of the car model. This is a very surprising result if we recall the findings in the theory section. The theory of Bass (2004) and Rogers (2004) proposes that early adopters of a new product are influenced by advertisement. Subsequently, these early adopters influence the early and late majority through WOM. Our results conflict with this theory in that they indicate pre-release online WOM without pre-release advertisement. The observed WOM on Twitter in the pre-launch phase can be explained by pre-launch campaigns on social media and through mailings. These marketing campaigns can be supported by PR activities, such as reports on the new car sent to magazines in the automotive industry. It is important to note that the information acquired about the media expenses of the various brands did not include expenditure on online display advertising. Furthermore, it did not include PR expenses.

On a more general note, we should also recall that this study is limited to the Dutch market. A similar study with datasets from other countries might lead to different conclusions. It is also noteworthy that the Twitter volume data was selected using specific search queries in the search tool (Appendix 1). In this research the most basic form of search query was applied, using the fully written name of the model and the brand. Because of the dynamics of language and language use, this seemed to be the fairest comparison of the different brand and model volumes. On the negative side, it increases the risk of noise in the Twitter volumes. Furthermore, it increases the chance of mentions of car models and brands being left out of the dataset. Examples are the Mercedes-Benz GLA-klasse and CLA-klasse: there is a chance that Twitter users do not use the full names of these models. If this is the case, it would strongly affect our results, as the other names used for these models are not included in our dataset.
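One way to reduce the risk of missed mentions would be to combine several spellings of a model name in a single query. The sketch below is a hypothetical illustration, not the procedure used in this thesis. It assumes that the search tool accepts quoted phrases joined by `OR` next to the `lang:nl` filter from Appendix 1. The function name `build_query` and the alias list are introduced here for the example, and only the first alias actually appears in Appendix 1.

```python
def build_query(names, lang="nl"):
    """Join alternative model names into one search query string.

    Each name is quoted as an exact phrase and the phrases are combined
    with OR. That syntax is an assumption about the search tool.
    """
    seen = set()
    unique = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.lower()
        # Skip blanks and names that differ only in letter case.
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    if not unique:
        raise ValueError("at least one non-empty name is required")
    phrases = " OR ".join(f'"{name}"' for name in unique)
    return f"({phrases}) lang:{lang}"


# Hypothetical aliases; only the first one is taken from Appendix 1.
gla_names = ["Mercedes Benz GLA klasse", "Mercedes GLA", "mercedes gla"]
query = build_query(gla_names)
print(query)
```

Widening the query in this way trades missed mentions for additional noise, so the aliases would need to be chosen consistently across all models to keep the volumes comparable.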


Managerial implications

Although this study does not find support for the opportunity to predict initial car sales with pre-launch online WOM, the results did indicate a positive correlation between the pre-launch volume of consumer tweets and car sales after the model becomes available. Our results show some support for the findings of Cui, Lui and Guo (2012) and Asur and Huberman (2010) that online WOM has become an important driving force in, and predictor of, new product sales. Online data contains information about future consumer behavior. For organizations this means that an effective online marketing strategy has become a key success factor for a new product launch. To harness the power of online WOM, marketers should invest in analyzing online data to learn more about their consumers and to improve management decision making. In the long term this will improve business practices and results (Feit et al., 2013). Based on the analysis they can, for instance, start identifying influencers, plan marketing communications and stimulate WOM.

Our results did not show a relationship between pre-launch advertising and pre-launch online WOM, mainly because companies do not appear to invest in car model promotion in the pre-launch phase. Because we did find a weak positive relation between pre-launch online WOM and initial car sales, and because the theory suggests a relationship between advertisement and WOM, it would be interesting for managers to see whether they can stimulate pre-launch online WOM with targeted media spending. Likewise, managers should attempt to initiate more pre-launch online WOM using their own social media accounts. In this study we only examined Twitter volumes, but marketers need to consider the distinctive influences of various aspects of online WOM when launching new products and devising online marketing strategies. For example, marketers should consider a sentiment analysis to measure consumers’ attitude towards the product or brand.


Further research

Although this study does not yield clear results, it provides sufficient starting points for further investigation. Previous research has already repeatedly demonstrated the value of Twitter. Because this study only had access to data on a monthly basis, we were forced to aggregate data. Further research should try to gather data on a daily basis, which should give a clearer view of the various relationships. Furthermore, future research should include other variables that better describe the tweet volumes in the months before the launch of a new car model. We suggest a variable that accounts for the different peaks in volume in the months before release, because the graphical data shows that thirteen of the seventeen car models have at least one peak in the pre-launch Twitter volumes. It would also be interesting to gather more metrics at the level of the individual tweet, for example the sentiment of a tweet or information about the person sending it, such as the size of that person’s network. Another interesting topic for further research would be to find other variables that complement online WOM and advertisement in predicting the initial sales of new cars. Urban, Weinberg and Hauser (1996), Bennett and Mandell (1969) and Newman and Staelin (1972) already indicated other sources of information for the consumer car purchase decision, like showroom visits, expert opinion and car model price, that can be used to predict new car sales. Koppel, Charlton and Fildes (2006) also mentioned some key features in the vehicle purchase process, like price, safety, running costs, oil price, re-sale, reputation and reliability, which can increase the forecasting possibilities of new car sales. To conclude, during this study we investigated the influence of online consumer and brand WOM on the initial sales of new cars. It would also be interesting to examine the long-term results of tweet volumes.
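The peak variable suggested above could be operationalized in many ways. The following is a minimal sketch of one possibility, not a method used in this thesis. It assumes that a peak is a period whose volume is higher than the preceding period, not lower than the following one, and at least `min_ratio` times the median volume of the series. The function name `count_peaks`, the threshold of 2.0 and the example series are all assumptions made for illustration.

```python
from statistics import median


def count_peaks(volumes, min_ratio=2.0):
    """Count local maxima of at least `min_ratio` times the series median.

    The first and last periods are never counted, because a peak needs
    a neighbouring period on both sides.
    """
    if len(volumes) < 3:
        return 0
    threshold = min_ratio * median(volumes)
    peaks = 0
    for i in range(1, len(volumes) - 1):
        rises = volumes[i] > volumes[i - 1]
        does_not_rise_after = volumes[i] >= volumes[i + 1]
        if rises and does_not_rise_after and volumes[i] >= threshold:
            peaks += 1
    return peaks


# Hypothetical monthly pre-launch tweet volumes for one car model.
monthly_tweets = [5, 8, 40, 9, 7, 30, 6]
n_peaks = count_peaks(monthly_tweets)
print(n_peaks)
```

With daily instead of monthly data the same idea would apply, although the threshold would probably need to be tuned to the noisier series.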


References

Achrekar, H., Gandhe, A., Lazarus, R., Yu, S. & Liu, B. 2011, "Predicting flu trends using twitter data", Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, IEEE, pp. 702.

Amblee, N. & Bui, T. 2008, "Can brand reputation improve the odds of being reviewed on-line?", International Journal of Electronic Commerce, vol. 12, no. 3, pp. 11-28.

Anderson, E.W. 2003, "The Formation of Market‐Level Expectations and Its Covariates", Journal of Consumer Research, vol. 30, no. 1, pp. 115-124.

Antweiler, W. & Frank, M.Z. 2004, "Is all that talk just noise? The information content of internet stock message boards", The Journal of Finance, vol. 59, no. 3, pp. 1259-1294.

Asur, S. & Huberman, B.A. 2010, "Predicting the future with social media", Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on, IEEE, pp. 492.

Azevedo, H. 2011, April 26-last update, The Netherlands Ranks #1 Worldwide in Penetration for Twitter and Linkedin [Homepage of Comscore], [Online]. Available: http://www.comscore.com/Insights/Press-Releases/2011/4/The-Netherlands-Ranks-number-one-Worldwide-in-Penetration-for-Twitter-and-LinkedIn [2014, 10/13].

Barney, J. 1991, "Firm resources and sustained competitive advantage", Journal of management, vol. 17, no. 1, pp. 99-120.

Bass, F.M. 2004, "Comments on “a new product growth for model consumer durables the bass model”", Management science, vol. 50, no. 12_supplement, pp. 1833-1840.

Bennett, P.D. & Mandell, R.M. 1969, "Prepurchase information seeking behavior of new car purchasers: The learning hypothesis", Journal of Marketing Research, pp. 430-433.

BOVAG 2014, November-last update, Verkoopcijfers auto [Homepage of BOVAG], [Online]. Available: http://www.bovag.nl/over-bovag/cijfers/verkoopcijfers-auto [2014, 12/28].

BOVAG, Wat is BOVAG? [Homepage of BOVAG], [Online]. Available: http://www.bovag.nl/over-bovag [2014, 10/13].

Chen, Y. & Xie, J. 2008, "Online consumer review: Word-of-mouth as a new element of marketing communication mix", Management Science, vol. 54, no. 3, pp. 477-491.

Choi, H. & Varian, H. 2012, "Predicting the present with Google trends", Economic Record, vol. 88, no. s1, pp. 2-9.

Clarke, D.G. 1976, "Econometric measurement of the duration of advertising effect on sales", Journal of Marketing Research, pp. 345-357.

Clemons, E.K., Gao, G.G. & Hitt, L.M. 2006, "When online reviews meet hyperdifferentiation: A study of the craft beer industry", Journal of Management Information Systems, vol. 23, no. 2, pp. 149-171.


Cooper, C.P., Mallon, K.P., Leadbetter, S., Pollack, L.A. & Peipins, L.A. 2005, "Cancer Internet search activity on a major search engine, United States 2001-2003", Journal of medical Internet research, vol. 7, no. 3, pp. e36.

Cui, G., Lui, H. & Guo, X. 2012, "The effect of online consumer reviews on new product sales", International Journal of Electronic Commerce, vol. 17, no. 1, pp. 39-58.

De Choudhury, M., Sundaram, H., John, A. & Seligmann, D.D. 2008, "Can blog communication dynamics be correlated with stock market activity?", Proceedings of the nineteenth ACM conference on Hypertext and hypermedia, ACM, pp. 55.

Dellarocas, C. 2006, "Strategic manipulation of internet opinion forums: Implications for consumers and firms", Management Science, vol. 52, no. 10, pp. 1577-1593.

Dellarocas, C. 2003, "The digitization of word of mouth: Promise and challenges of online feedback mechanisms", Management science, vol. 49, no. 10, pp. 1407-1424.

Dellarocas, C., Zhang, X.M. & Awad, N.F. 2007, "Exploring the value of online product reviews in forecasting sales: The case of motion pictures", Journal of Interactive marketing, vol. 21, no. 4, pp. 23-45.

Dichter, E. 1966, "How word-of-mouth advertising works", Harvard business review, vol. 44, no. 6, pp. 147-160.

Eliashberg, J. & Sawhney, M.S. 1994, "Modeling goes to Hollywood: Predicting individual differences in movie enjoyment", Management Science, vol. 40, no. 9, pp. 1151-1173.

Ettredge, M., Gerdes, J. & Karuga, G. 2005, "Using web-based search data to predict macroeconomic statistics", Communications of the ACM, vol. 48, no. 11, pp. 87-92.

Feit, E.M., Wang, P., Bradlow, E.T. & Fader, P.S. 2013, "Fusing Aggregate and Disaggregate Data with an Application to Multiplatform Media Consumption", Journal of Marketing Research, vol. 50, no. 3, pp. 348-364.

Goel, S. & Goldstein, D.G. 2013, "Predicting Individual Behavior with Social Networks", Marketing Science.

Goel, S., Hofman, J.M., Lahaie, S., Pennock, D.M. & Watts, D.J. 2010, "Predicting consumer behavior with Web search", Proceedings of the National Academy of Sciences of the United States of America, vol. 107, no. 41, pp. 17486-17490.

Goldman Sachs Group 2014, The Internet of Things: Making sense of the next mega-trend., Goldman Sachs Group.

Granello, D.H. & Wheaton, J.E. 2004, "Online data collection: Strategies for research", Journal of Counseling & Development, vol. 82, no. 4, pp. 387-393.

Grewal, R., Cline, T.W. & Davies, A. 2003, "Early-entrant advantage, word-of-mouth communication, brand similarity, and the consumer decision-making process", Journal of Consumer Psychology, vol. 13, no. 3, pp. 187-197.


Heyse, J.F. & Wei, W.W. 1985, "Modelling the advertising‐sales relationship through use of multiple time series techniques", Journal of Forecasting, vol. 4, no. 2, pp. 165-181.

Holbrook, M.B. & Addis, M. 2008, "Art versus commerce in the movie industry: a Two-Path Model of Motion-Picture Success", Journal of Cultural Economics, vol. 32, no. 2, pp. 87-107.

Huang, P., Lurie, N.H. & Mitra, S. 2009, "Searching for experience on the web: an empirical examination of consumer behavior for search and experience goods", Journal of Marketing, vol. 73, no. 2, pp. 55-69.

Huberman, B.A., Romero, D.M. & Wu, F. 2008, "Social networks that matter: Twitter under the microscope", arXiv preprint arXiv:0812.1045.

Hui, S.K., Bradlow, E.T. & Fader, P.S. 2009, "Testing behavioral hypotheses using an integrated model of grocery store shopping path and purchase behavior", Journal of consumer research, vol. 36, no. 3, pp. 478-493.

Jansen, B.J., Zhang, M., Sobel, K. & Chowdury, A. 2009, "Twitter power: Tweets as electronic word of mouth", Journal of the American Society for Information Science and Technology, vol. 60, no. 11, pp. 2169-2188.

Java, A., Song, X., Finin, T. & Tseng, B. 2007, "Why we twitter: understanding microblogging usage and communities", Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, ACM, pp. 56.

King, M.F. & Balasubramanian, S.K. 1994, "The effects of expertise, end goal, and product type on adoption of preference formation strategy", Journal of the Academy of Marketing Science, vol. 22, no. 2, pp. 146-159.

Koppel, S., Charlton, J. & Fildes, B. 2007, "How important is vehicle safety in the new vehicle purchase/lease process for fleet vehicles?", Traffic injury prevention, vol. 8, no. 2, pp. 130-136.

Lampos, V. & Cristianini, N. 2010, "Tracking the flu pandemic by monitoring the social web".

Legatum institute 2014, Prosperity index suffleboard [Homepage of Legatum institute], [Online]. Available: http://www.prosperity.com/#!/?aspxerrorpath=%2Fdefault.aspx [2014, 10/13].

Li, X. & Hitt, L.M. 2008, "Self-selection and information role of online product reviews", Information Systems Research, vol. 19, no. 4, pp. 456-474.

Liu, Y. 2006, "Word of mouth for movies: Its dynamics and impact on box office revenue", Journal of Marketing, vol. 70, no. 3, pp. 74-89.

Miller, K.D., Fabian, F. & Lin, S. 2009, "Strategies for online communities", Strategic Management Journal, vol. 30, no. 3, pp. 305-322.

Mudambi, S.M. & Schuff, D. 2010, "What makes a helpful online review? A study of customer reviews on Amazon.com", Management Information Systems Quarterly, vol. 34, no. 1, pp. 11.

Neelamegham, R. & Chintagunta, P. 1999, "A Bayesian model to forecast new product performance in domestic and international markets", Marketing Science, vol. 18, no. 2, pp. 115-136.


Newman, J.W. & Staelin, R. 1972, "Prepurchase information seeking for new cars and major household appliances", Journal of Marketing Research, pp. 249-257.

Phelps, J.E., Lewis, R., Mobilio, L., Perry, D. & Raman, N. 2004, "Viral marketing or electronic word-of-mouth advertising: Examining consumer responses and motivations to pass along email", Journal of Advertising Research, vol. 44, no. 04, pp. 333-348.

Polgreen, P.M., Chen, Y., Pennock, D.M. & Nelson, F.D. 2008, "Using internet searches for influenza surveillance", Clinical infectious diseases : an official publication of the Infectious Diseases Society of America, vol. 47, no. 11, pp. 1443-1448.

RDW, ABOUT RDW [Homepage of RDW], [Online]. Available: http://www.rdw.nl/englishinformation/Paginas/About-RDW.aspx?path=Portal/Information%20in%20English/About%20RDW [2014, 10/13].

Rogers, E.M. 2004, "A prospective and retrospective look at the diffusion model", Journal of health communication, vol. 9, no. S1, pp. 13-19.

Rossi, P.E., McCulloch, R.E. & Allenby, G.M. 1996, "The value of purchase history data in target marketing", Marketing Science, vol. 15, no. 4, pp. 321-340.

Saunders, M. & Lewis, P. 2011, Doing research in business & management, 1st edition edn, Pearson Education Limited.

The Economist Intelligence Unit 2013, , The Economist Intelligence Unit.

Thomas, G.M. 2004, "Building the buzz in the hive mind", Journal of Consumer Behaviour, vol. 4, no. 1, pp. 64-72.

Tumasjan, A., Sprenger, T.O., Sandner, P.G. & Welpe, I.M. 2010, "Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment.", ICWSM, vol. 10, pp. 178-185.

Urban, G.L., Weinberg, B.D. & Hauser, J.R. 1996, "Premarket forecasting of really-new products", The Journal of Marketing, pp. 47-60.

Véronis, J. 2007, "Citations dans la presse et résultats du premier tour de la présidentielle 2007", Retrieved December, vol. 15, pp. 2009.

Vocht, A. 2007, Basishandboek SPSS 15 voor Windows, First edition edn, Bijleveld Press.

Weathers, D., Sharma, S. & Wood, S.L. 2007, "Effects of online communication practices on consumer perceptions of performance uncertainty for search and experience goods", Journal of Retailing, vol. 83, no. 4, pp. 393-401.

Wikipedia 2014, 11/06/2014-last update, Twitter [Homepage of Wikipedia], [Online]. Available: http://en.wikipedia.org/wiki/Twitter [2014, 06/15].

Worldbank 2011, Motor vehicles per 1,000 people [Homepage of The World Bank], [Online]. Available: http://data.worldbank.org/indicator/IS.VEH.NVEH.P3?order=wbapi_data_value_2011+wbapi_data_value+wbapi_data_value-last&sort=desc [2014, 10/13].


Zarrella, D. 2009, "Science of retweets", Retrieved December, vol. 15, pp. 2009.

Zhang, X., Fuehres, H. & Gloor, P.A. 2011, "Predicting stock market indicators through twitter “I hope it is not as bad as I fear”", Procedia-Social and Behavioral Sciences, vol. 26, pp. 55-62.


Appendix 1: Search queries Twitter

Table A1: Search queries Twitter

| Models | Topic search | Brand search | Twitter account |
| --- | --- | --- | --- |
| Renault Captur | Renault Captur lang:nl | Renault lang:nl | Renault_nl |
| Peugeot 2008 | Peugeot 2008 lang:nl | Peugeot lang:nl | PeugeotNL |
| Opel Adam | Opel adam lang:nl | Opel lang:nl | Opel_Nederland |
| Kia Carens | Kia Sarens lang:nl | Kia lang:nl | Kia_motors_nl |
| Peugeot 108 | Peugeot 108 lang:nl | Peugeot lang:nl | PeugeotNL |
| Volkswagen Golf Sportsvan | Volkswagen Golf Sportsvan lang:nl | Volkswagen lang:nl | VolkswagenNL |
| Citroen C4 Cactus | Citroën C4 Cactus lang:nl | Citroën lang:nl | CitroenNL |
| BMW 3 serie Gran Turismo | BMW 3 serie Gran Turismo lang:nl | BMW lang:nl | BMWgroup_NL |
| Mercedes-Benz CLA-klasse | Mercedes Benz CLA klasse lang:nl | Mercedes Benz lang:nl | Mercedesbenz_nl |
| Fiat 500L | Fiat 500L lang:nl | Fiat lang:nl | Fiatnederland |
| Renault Zoe | Renault Zoe lang:nl | Renault lang:nl | Renault_nl |
| Seat Toledo | Seat Toledo lang:nl | Seat lang:nl | Seat_nl |
| BMW i3 | BMW i3 lang:nl | BMW lang:nl | BMWgroup_NL |
| Opel Cascada | Opel Cascada lang:nl | Opel lang:nl | Opel_Nederland |
| Mini Paceman | Mini Paceman lang:nl | Mini lang:nl | Mini_nl |
| Mercedes-Benz GLA-klasse | Mercedes Benz GLA klasse lang:nl | Mercedes Benz lang:nl | Mercedesbenz_nl |
| Porsche Macan | Porsche Macan lang:nl | Porsche lang:nl | No Twitter account |


Appendix 2: Correlation plots

Figure A1: Correlation plot Total-tweets-car and Total-sales

[Scatter plot. x-axis: Total-tweets-car (log10); y-axis: Total-sales (log10).]

Figure A2: Correlation plot Total-tweets-brand and Total-sales

[Scatter plot. x-axis: Total-tweets-brand (log10); y-axis: Total-sales (log10).]

Figure A3: Correlation plot Media-expenditure-pre and Total-sales

[Scatter plot. x-axis: Media-expenditure-pre (log10); y-axis: Total-sales (log10).]


Figure A4: Correlation plot Media-expenditure-pre and Total-sales

[Scatter plot. x-axis: Media-expenditure-pre (log10); y-axis: Total-sales (log10).]

Figure A5: Correlation plot Tweets-month-pre and Sales-month-post

[Scatter plot. x-axis: Tweets-month-pre (log10); y-axis: Sales-month-post.]

Figure A6: Correlation plot Media-expenditure and Total-sales

[Scatter plot. x-axis: Media-expenditure-post; y-axis: Total-sales (log10).]


Appendix 3: Key findings literature review

Table 1: Key findings literature review

| Topic | Source | Findings |
| --- | --- | --- |
| Tweet volumes | Asur and Huberman (2010) | The more pre-launch online WOM about a particular topic, the higher the box-office revenues for movies. |
| | Zhang, Fuehres and Gloor (2011) | The more positive tweets about a financial market, the higher the chance of a financial upturn. |
| | Cui, Lui and Guo (2012) | The volume of online product reviews has a strong effect on product sales. The rationale behind the effect of volume of product reviews is that discussions about a product in online forums lead to increased awareness among consumers. The volume of reviews has a significant positive effect on new product sales in the early period of a product life cycle. This effect decreases over time, which signals the significant role played by early reviews. |
| | Liu (2006) | The volume of messages on newly released movies has proven to be a good predictor of their box office success. |
| | Achrekar et al. (2011) | The volume of flu related tweets is highly correlated with the number of fever cases reported. |
| | Tumasjan et al. (2010) | The number of tweets concerning a political party or person reflects voter preferences and comes close to traditional election polls. |
| Mass media | Clarke (1976) | The positive effect from advertisements on sales occurs within three to nine months. |
| | Heyse and Wei (1985) | Sales and future advertising were mostly related in one period. Specifically, a strong connection exists between advertising budgets and current sales. Advertising budgets are often set as a percentage of sales. |
| | Cui, Lui and Guo (2012) | In the early stage of a product life cycle innovators are mainly affected by mass media. After using the new products they pass their opinions to latecomers via the WOM channels. |
| | Dichter (1966) | There exists a ready-made market of influencers who can be reached and, in turn, influenced by advertising. The rest of the consumers are influenced by these influentials through WOM. |
| Brand communication about product | Asur and Huberman (2010) | Prior to the release of a movie, media companies and producers generate promotional information in the form of trailer videos, news, blogs and photos. |
| Search goods | Cui, Lui and Guo (2012) | Search products are more subject to the valence of reviews and the volume of page views. |
| | Cui, Lui and Guo (2012) | Search products are goods that consumers can evaluate by specific attributes before purchase, such as electronics. |
| Experience goods | Cui, Lui and Guo (2012) | Experience products are more subject to the influence of the volume of reviews. |
| | Cui, Lui and Guo (2012) | Experience products require feeling or experiencing and are more difficult to describe using specific attributes. Experience products can evoke different experiences across consumers. |
| Popularity of brand | Newman and Staelin (1972) | The purchase and use of a product result in learning which later influences buying behavior. |
| Online WOM | Cui, Lui and Guo (2012) | Online WOM has become an important driving force in new product sales. An increasing number of previous studies have found a positive relationship between online consumer reviews and product sales, including books, movies, and video games. |
| | Jansen et al. (2009) | One fifth of a random sample of tweets contained mentions of a product or brand. |
| | Asur and Huberman (2010) | Social media expresses a collective wisdom which, when properly used, can yield an extremely powerful and accurate indicator of future outcomes. |
| | Liu (2006) | Online WOM about a film concentrates on the weeks before and after the release day. |
| | Holbrook and Addis (2007) | Online WOM increases with a film's budget and WOM positively impacts the revenues of the film. Online WOM is an approximation for the media exposure. |
| Car sales | Urban, Weinberg and Hauser (1996) | Consumers use showroom visits, advertising, magazine articles and word-of-mouth in their search for information about automobiles. |
| | Newman and Staelin (1972) | Identified sources for information seeking are categorized in friends or neighbors; books, pamphlets, magazine or newspaper articles; newspaper or magazine advertisements; television commercials and other sources, such as repairmen or mechanics. |
| | Koppel, Charlton and Fildes (2006) | Key features in the vehicle purchase process are warranty, type, price, style, safety, running costs, re-sale, reputation, reliability, performance, model, fuel, country and comfort. |
| | Bennett and Mandell (1969) | Consumers use reports, dealer visits, expert opinion, friends’ opinions, reading brochures, discussion with spouse, auto show, advertisement, new articles and discussion with children as sources of information in the new car purchase decision process. |
| Purchase decision making | Newman and Staelin (1972) | Half of the buyers of new cars and major appliances had purchase decision times of one or two weeks. |
| | Newman and Staelin (1972) | Half of the buyers thought mainly of only one brand at the outset of the decision process. |
| | Cui, Lui and Guo (2012) | Since the rise of the internet, online WOM communication has become an important source of information for consumers planning to purchase new products. User generated content, such as tweets, helps customers make informed decisions about purchasing new products. |
| | Dichter (1966) | If the consumer risks are high, WOM recommendations are one of the strongest influencers on product purchase decisions of consumers. Advertising cannot sell against personal influence. |
| Twitter | Asur and Huberman (2010) | Twitter has attracted lots of attention from organizations because of the huge potential it provides for viral marketing. Organizations are using Twitter to advertise products and spread information to stakeholders. |
