Strong Regularities in Growth and Decline of Popularity of Social Media Services
Christian Bauckhage
University of Bonn, Fraunhofer IAIS, Bonn, Germany

Kristian Kersting
TU Dortmund University, Fraunhofer IAIS, Dortmund, Germany

ABSTRACT
We analyze general trends and patterns in time series that characterize the dynamics of collective attention to social media services and Web-based businesses. Our study is based on search frequency data available from Google Trends and considers 175 different services. For each service, we collect data from 45 different countries as well as global averages. This way, we obtain more than 8,000 time series which we analyze using diffusion models from the economic sciences. We find that these models accurately characterize the empirical data and our analysis reveals that collective attention to social media grows and subsides in a highly regular and predictable manner. Regularities persist across regions, cultures, and topics and thus hint at general mechanisms that govern the adoption of Web-based services. We discuss several cases in detail to highlight interesting findings. Our methods are of economic interest as they may inform investment decisions and can help assess at what stage of the general life cycle a Web service is.

[Figure 1, panels (a) buzznet, (b) failblog, (c) flickr, (d) librarything, (e) studiVZ, (f) wikipedia. Each panel plots the Google Trends time series for 2004 to 2013 (vertical axis up to 100) together with a fitted shifted Gompertz curve.]
Figure 1: Examples of Google Trends time series which summarize how worldwide searches for different social media services evolve over time. Even though individual curves differ considerably, an appropriately parameterized diffusion model accounts well for the apparent general trends of initial growth and subsequent decline of interest. Results obtained from more than 8,000 temporal signatures of collective attention on the Web indicate that these findings are universal and that interests of large crowds of users follow these patterns regardless of regional, cultural, or linguistic backgrounds.

Categories and Subject Descriptors
G.3 [Probability and Statistics]: Time series analysis; H.3.5 [Online Information Services]: Web-based services

General Terms
Economics, Human Factors, Measurement

Keywords
social media services, collective attention, trend prediction

arXiv:1406.6529v1 [cs.SI] 25 Jun 2014

1. INTRODUCTION
The problem of understanding the dynamics of collective human attention has been called a key scientific challenge for the information age [39]. In this paper, we address a specific aspect of this problem and mine search frequency data for common trends and shared characteristics. Our focus is on query logs which summarize the evolution of global and regional interests in social media services and we explore to what extent the general dynamics of collective attention apparent from these data can be modeled mathematically.

Search frequency analysis is an emerging topic and a growing body of work shows that patterns found in aggregated search data of large populations of Web users can provide insights into collective concerns, interests, or habits. Results on temporal dynamics of search engine queries are reported from various fields and include data-driven models of the spread of diseases [19], accounts of the propagation of news items [4, 5, 14], characterizations of the formation of political opinions [22], or predictions of tourism flows [2].

Search frequencies are of particular interest in nowcasting, which aims at real-time monitoring of economic trends and developments [12]. Aggregated search behaviors of millions of users yield reliable predictions for sales or general economic indicators [13, 33]. Temporal changes in search volumes were found to correlate with changes in the behavior of investors [9, 16] and to allow for predicting abnormal stock returns [18, 26]. Accordingly, analysts in the social sciences, public health, or economics are beginning to embrace query log analysis as an alternative to more traditional methods.

The work reported here originates from a project on Web intelligence where we ask for socio-economic motivations for individuals to participate in collective endeavors on the Web. Regarding services, products, and campaigns we investigate approaches that would allow companies or marketeers to recognize whether they need to adjust their strategies in order to remain competitive in the modern Web environment. In particular, we ask to what extent it is possible to predict the future success or adoption of services, products, or marketing messages using collective Web intelligence.

Our paradigm is to mine Web data for possible indicators of trends in collective attention. In this paper, we consider time series obtained from Google Trends which summarize search interests of millions of users worldwide and we focus on temporal signatures that characterize evolving interests in social media.

Table 1: 45 countries considered in this study
Africa      MA, NG, ZA
Asia        CN, ID, IN, JP, KR, MY, PH, TH, TW
Australia   AU, NZ
Europe      AT, BE, CH, CZ, DE, DK, ES, FI, FR, GR, IE, IL, IT, NL, NO, PL, PT, RU, SE, TR, UA, UK
N-America   CA, MX, US
S-America   AR, BR, CL, CO, PE, VE
Extending previously published work [6], our contributions are as follows:

1) We briefly review recent results which underline that Google Trends data provide meaningful and reliable proxies for research on how opinions and interests of large crowds and populations evolve over time.

2) We analyze search frequency data from 45 countries related to 175 social media services and Web businesses. Given this comprehensive empirical basis, we perform trend analysis using economic diffusion models and find them to be in excellent agreement with the data. In particular, we find that collective attention to social media as evident from search frequencies evolves according to notably regular patterns. Although microscopic behaviors may be chaotic, general trends apparent in these data typically show simple and highly regular dynamics of growth and decline.

3) We present evidence that this phenomenon persists across regions, cultures, and linguistic backgrounds and we elaborate on several particular examples to highlight interesting findings. We investigate the potential of our models for forecasting and present qualitative results which indicate that they indeed allow for reasonable predictions of future developments of collective attention.

Next, we discuss the empirical basis of our study. Section 3 reviews models and methods applied for analysis; results are discussed in section 4. Section 5 contrasts our work with the related literature and section 6 concludes this paper.

2. SEARCH FREQUENCY DATA: A PROXY OF COLLECTIVE ATTENTION
Our overall goal is to proceed towards a better understanding of the dynamics of collective interests and concerns of large populations of Web users. The empirical basis for the work reported here consists of time series obtained from Google Trends which indicate how search volumes related to specific topics evolve over time.

2.1 Background
Google Trends is a publicly accessible service that provides statistics on queries users submitted to Google's search engine. It allows for retrieving weekly summaries of how frequently a query has been used since January 1st 2004. Aggregated statistics are available in the form of global averages but can be narrowed down to regional statistics, for instance on the level of individual countries.

Questions pertaining to its validity and the significance of search data have been addressed in two recent contributions. Mellon [34] correlated results from traditional Gallup surveys with Google Trends data and found that, with respect to political and economic issues covered in traditional opinion polls, search frequencies provide accurate proxies of the dynamics of salient public opinions. Teevan et al. [38] studied how people navigate the Web and found that over 25% of all queries to search engines are navigational queries, i.e. searches for company names such as facebook, youtube, or myspace that are intended to find and then access particular Web sites. In other words, a large percentage of Web users consistently rely on Google searches rather than on bookmarks or on entering URLs in order to navigate to Web sites. Together these findings thus suggest that data from Google Trends which aggregate information about the search activities of millions of users are indeed indicative of collective interests in Internet services, technical products, or novelties.

2.2 Data Collection and Preprocessing
In this paper, we analyze global and regional temporal search statistics related to query terms such as ebay, facebook, or youtube that indicate a population's interest in social media services or Web-based businesses. For potentially ambiguous queries, we retrieve data for different spellings (e.g. google plus, googleplus, google+, google +) and compute their average. In total, we consider data from 45 different countries related to 175 services. As we also retrieve corresponding global search activities, our empirical basis consists of more than 8,000 data sets.

The 45 countries considered in this study are listed in Tab. 1. They were selected according to population size, Internet penetration, and availability of query logs. Note that this sample covers various regions, cultures, and official languages and is deliberately not restricted to countries that are technologically far advanced.

The 175 social media sites and Web businesses we consider are listed in Tab. 2.
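
The preprocessing described in section 2.2 can be illustrated with a minimal sketch. It assumes that "compute their average" means a point-wise average over weeks of equally long series, which the text does not state explicitly. The function name average_spellings and the weekly values are hypothetical and serve only as illustration. The count of series assumes that every service yields one series per country plus one global series.

```python
from statistics import mean


def average_spellings(series_by_spelling):
    """Average weekly search-frequency series point-wise across spellings.

    Assumption: all spelling variants cover the same weeks, so the
    series can be averaged week by week.
    """
    lengths = {len(series) for series in series_by_spelling.values()}
    if len(lengths) != 1:
        raise ValueError("all spelling variants must cover the same weeks")
    return [mean(week) for week in zip(*series_by_spelling.values())]


# Hypothetical weekly values on a 0-100 scale (not real Google Trends data).
variants = {
    "google plus": [0, 10, 40, 30],
    "googleplus": [0, 20, 60, 50],
    "google+": [0, 30, 80, 70],
    "google +": [0, 20, 60, 50],
}
averaged = average_spellings(variants)

# Size of the empirical basis: 175 services, each with 45 country-level
# series plus 1 global series (assuming no series is missing).
n_services = 175
n_countries = 45
n_series = n_services * (n_countries + 1)
print(averaged, n_series)
```

Under these assumptions the count is 175 x 46 = 8,050, which is consistent with the "more than 8,000" data sets reported above.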