"Towards Re-Inventing Psychohistory": Predicting the Popularity of Tomorrow’S News from Yesterday’S Twitter and News Feeds
Total Page:16
File Type:pdf, Size:1020Kb
J SYST SCI SYST ENG Vol. 0, No. 0, 0, pp. 1–20 ISSN: 1004-3756 (paper), 1861-9576 (online) DOI: https://doi.org/10.1007/s11518-020-5470-4 CN 11-2983/N "Towards Re-Inventing Psychohistory": Predicting the Popularity of Tomorrow’s News from Yesterday’s Twitter and News Feeds Jiachen Sun,a, b Peter Gloora aMIT Center for Collective Intelligence, 245 First Street, 02142, Cambridge, MA, USA [email protected] () bSchool of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, 510275, China [email protected] Abstract. Rapid advances in machine learning combined with wide availability of online social media have created considerable research activity in predicting what might be the news of tomorrow based on an analysis of the past. In this work, we present a deep learning forecasting framework which is capable to predict tomorrow’s news topics on Twitterand news feeds based on yesterday’s content and topic-interaction features. The proposed framework starts by generating topics from words using word embeddings and K-means clustering. Then temporal topic-networks are constructed where two topics are linked if the same user has worked on both topics. Structural and dynamic metrics calculated from networks along with content features and past activity, are used as input of a long short-term memory (LSTM) model, which predicts the number of mentions of a specific topic on the subsequent day. Utilizing dependencies among topics, our experiments on two Twitter datasets and the HuffPost news dataset demonstrate that selecting a topic’s historical local neighbors in the topic-network as extra features greatly improves the prediction accuracy and outperforms existing baselines. Keywords: Topic’s popularity, trend forecasting, social media 1. Introduction sales (Gruhl et al. 2005), stock prices (Choud- hury et al. 2008, Zhang et al. 2011), election out- “Since emotions are few and reasons many, the be- comes (Ebrahimi et al. 2017), and Oscar awards havior of a crowd can be more easily predicted than (Krauss et al. 2010). Today, thanks to the ad- the behavior of one person can. And that, in turn, vent of artificial intelligence and the boom in means that if laws are to be developed that enable social networking, we finally have reached the the current of history to be predicted, then one must point where we can tackle the general concept deal with large populations, the larger the better. of psychohistory foreseen by Asimov over 60 That might itself be the First Law of Psychohistory, years ago. In support of this trend, the notion the key to the study of Humanics. Yet." of crowd behavior prediction on social media, - Isaac Asimov, Robots and Empire such as what kind of topic people are going Inspired by “psychohistory” invented by to mention, and how many shares or page- the fictitious mathematician Hari Seldon in the views an online article or a video will receive, science fiction stories of Isaac Asimov, many re- has been widely studied. Obviously, the capa- searchers have tried to predict parts of the fu- bility to forecast online content popularity is ture using social media sources, although until of great significance to both theory and prac- now mostly limited to areas where solid time- tice. On the one hand, predicting tomorrow’s series of the past are available, such as book © Systems Engineering Society of China and Springer-Verlag GmbH Germany 2020 1 2 Sun and Peter: "Towards Re-Inventing Psychohistory": Predicting the Popularity of Tomorrow’s News from Yesterday’s Twitter and News Feeds news presents important opportunities to cap- popularity forecasting framework which can ture the evolution of a global audience’s atten- be applied to both social networking plat- tion around the massive and fast-growing con- forms and news feeds. In particular, we tents and topics available on social media and study the complex relationships among topics thus helps researchers better understand how through the structure of the underlying topic- people utilize the Internet in terms of collec- interaction networks. The proposed scheme tive human behavior. On the other hand, news for topic prediction involves word embed- organizations, politicians, and marketing ex- ding, unsupervised clustering, network struc- ecutives of companies would all like to know ture analysis and deep learning for time series what matters to their audiences tomorrow, to forecasting. Firstly, frequent words extracted target their messages in the best possible way to from original texts (e.g. tweets, news arti- suit their needs. For example, knowing what cles) are represented as high dimensional vec- kind of news topics will bring the most traf- tors using Word2Vec embeddings, which are fic on social media in the near future enables further clustered into a number of meaning- media participants to develop efficient market- ful topics by the K-means algorithm comple- ing strategies to either maximize the impact of mented with manually labelling. These well- their online articles or downgrade unpopular defined topics are then denoted as nodes in the content in advance. network where links between two topics are created if the same user has worked on both Given the high level of interest in online of them, such that a series of temporal topic- content popularity prediction, a number of re- networks can be created. After that we use searches (Szabo and Huberman 2010, Ahmed structure and dynamic properties of the net- et al. 2013, Tatar et al. 2011, Weng et al. 2014, work combined with information about con- Abbar et al. 2018) have been proposed by mea- tent and past activity to train a long short-term suring a topic or article’s popularity as user memory (LSTM) (Hochreiter and Schmidhu- participative activity such as user comments, ber 1997) model, which predicts the number votes, or the size of the crowd who men- of mentions of a specific topic on the subse- tion it. However, most of the proposed tech- quent day. Furthermore, we propose a simple niques merely rely on early popularity mea- but powerful approach to select relevant top- surements and consider the prediction of a sin- ics as additional features using the network’s gle topic or article’s popularity as a separate historical information about the topic’s local isolated task by neglecting the underlying con- neighbors. We show that using these top- nection between the predicted topic or article ics as additional features can significantly im- and other news items. Intuitively, the popu- prove the prediction results and outperforms larity of a topic or article is often correlated the existing approaches including the linear- with or even dependent on other relevant top- correlation feature selection method (Abbar et ics/articles. For example, nobody would talk al. 2018). Extensive experiments on two Twitter about #MakeAmericaGreatAgain, or the wall datasets and the HuffPost news feeds dataset to Mexico without Donald Trump. Therefore, demonstrate the effectiveness and efficiency of it would clearly be useful if we can incorporate the proposed scheme. the relevance of news topics in the predictive model to improve prediction performance. The remainder of this manuscript is orga- In this work, we develop a unified topic- nized as follows. In Section 2, we provide an Sun and Peter: "Towards Re-Inventing Psychohistory": Predicting the Popularity of Tomorrow’s News from Yesterday’s Twitter and News Feeds 3 overview of related work. Section 3 introduces been efforts towards building sophisticated the datasets used in this work. The method of predictive models incorporating various exoge- topic generation is given in Section 4. Section 5 nous features beyond the past activity such as presents the topic-network creation as well as user interaction, social network structure and visualization results from three datasets. The other characteristics of social media content. measurement of popularity and features are For instance, Tatar et al. (2011 2012) predicted provided in Section 6, along with the proposed the number of comments to news articles on a topic selection method. Section 7 introduces French news platform, based on earlier mea- the prediction task’s results. Finally, Section 8 surements such as publication date, category, contains conclusions and further applications section and number of comments. Bandari et and extensions. al. (2012) used the category of a news article to- gether with information about the communica- 2. Related Work tion source, language subjectivity, and named In this section, we summarize existing re- entities to predict the popularity of news ar- searches which are closely related to the study ticles before they go online. Regarding net- presented in this paper, grouped by the differ- work impact, Ruan et al. (2012) showed that ent features used in the predictive models. combining past activity with social network Single past time series. When predicting features (e.g. the amount of topic-related in- how active the crowd behavior in the future formation a user has received from his/her will be, a typical approach is to look at the sin- friends) can improve the prediction perfor- gle past time series of such activity and employ mance of a topic’s volume on Twitter. The ef- regression-based techniques. For instance, fect of a user-interaction network was also vali- Szabo and Huberman (2010) found a strong dated in Weng et al. (2014) where features from correlation between the logarithmically trans- basic network topology and community struc- formed popularities at early and later times. ture turned out to be the most effective pre- They showed that the total number of views dictors. Besides, emotion features conveyed in of YouTube videos and votes of Digg stories news headlines also demonstrated predictive can be predicted by measuring the observed power for news popularity (Gupta et al.