Personalized News Recommendation Based on Click Behavior Jiahui Liu, Peter Dolan, Elin Rønby Pedersen Google Inc

Total Page:16

File Type:pdf, Size:1020Kb

Personalized News Recommendation Based on Click Behavior Jiahui Liu, Peter Dolan, Elin Rønby Pedersen Google Inc Personalized News Recommendation Based on Click Behavior Jiahui Liu, Peter Dolan, Elin Rønby Pedersen Google Inc. 1600 Amphitheatre Parkway, Mountain View, CA 94043, USA {jiahui, peterdolan, elinp}@google.com ABSTRACT to thousands of sources via the internet. News aggregation Online news reading has become very popular as the web websites, like Google News and Yahoo! News, collect news provides access to news articles from millions of sources from various sources and provide an aggregate view of around the world. A key challenge of news websites is to news from around the world. A critical problem with news help users find the articles that are interesting to read. In service websites is that the volumes of articles can be this paper, we present our research on developing overwhelming to the users. The challenge is to help users personalized news recommendation system in Google find news articles that are interesting to read. News. For users who are logged in and have explicitly enabled web history, the recommendation system builds Information filtering is a technology in response to this profiles of users’ news interests based on their past click challenge of information overload in general. Based on a behavior. To understand how users’ news interest change profile of user interests and preferences, systems over time, we first conducted a large-scale analysis of recommend items that may be of interest or value to the anonymized Google News users click logs. Based on the user. Information filtering plays a central role in log analysis, we developed a Bayesian framework for recommender systems, as it is able to recommend predicting users’ current news interests from the activities information that has not been rated before and of that particular user and the news trends demonstrated in accommodates the individual differences between users [3, the activity of all users. We combine the information 8]. Information filtering has been applied in various filtering mechanism using learned user profiles with an domains, such as email [16], news [4, 5, 20], and web existing collaborative filtering mechanism to generate search [15, 18]. In the domain of news, this technology personalized news recommendations. The combined particularly aims at aggregating news articles according to method was deployed in Google News. Experiments on the user interests and creating a “personal newspaper” for each live traffic of Google News website demonstrated that the user. combined method improves the quality of news An accurate profile of users' current interests is critical for recommendation and increases traffic to the site. the success of information filtering systems. Some systems [1, 19] require users to manually create and update profiles. Author Keywords This approach places an extra burden on users, something Personalization, user modeling, news trend. very few are willing to take on. Instead, systems can construct profiles automatically from users' interaction with ACM Classification Keywords the system. H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous. In this paper, we describe our research on developing a personalized news recommendation system based on INTRODUCTION profiles learned from user activity in Google News. The News reading has changed with the advance of the World Google News website, available at http://news.google.com, Wide Web, from the traditional model of news is one of the most popular news websites in the world, consumption via physical newspaper subscription to access receiving millions of page views and clicks from users around the world The nature of news reading makes news information Permission to make digital or hard copies of all or part of this work for filtering distinctive from information filtering in other personal or classroom use is granted without fee provided that copies are domains. When visiting a news website, the user is looking not made or distributed for profit or commercial advantage and that copies for new information, information that she did not know bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior before, that may even surprise her. Since user profiles are specific permission and/or a fee. inferred from past user activity, it is important to know how CHI 2009, April 4–9, 2009, Boston, MA, USA. users’ news interests change over time and how effective it Copyright 2009 ACM 978-1-60558-246-7/08/04…$5.00 1 would be to use the past user activities to predict their “Recommended for [account]”, containing stories future behavior. recommended based on her click history in Google News. For our analysis, the recorded click histories were fully To understand this issue, we conducted a large scale log anonymized and kept secure according to the Google analysis of Google News users to measure the stability of Privacy Policy. users’ news interests. We found that their interests do vary over time but follow the aggregate trend of news events. A previous Google News recommendation system was Based on these findings, we develop a Bayesian model to developed using a collaborative filtering method [7]. It predict the news interests of an individual user from the recommends news stories that were read by users with activities of that particular user and the news trend similar click history. This method has two major drawbacks demonstrated in activities of a group of users. To in recommending news stories. First, the system cannot recommend news stories to users, the system takes into recommend stories that have not yet been read by other account of the genuine interests of individual user and the users, a problem that is often referred to as the first-rater current news trend. Therefore, the user will receive news problem [7, 8]. For news recommendations, this is a serious tailored to her interests without missing the important news problem, as news service websites strive to present the most events, even when those events do not strictly match the updated information to users in a timely manner. News user’s particular interests. articles presented in Google News are usually published within one hour. However, the collaborative filtering We combined the news information filtering method with method has to wait several hours to collect enough clicks to the collaborative filtering method previously developed for recommend the news story to users, resulting in undesirable Google News [7] to generate personalized time lags between break-out news and recommendations. recommendations for news access. The combined method Second, not all users are equal to each other, and the was evaluated in a live experiment: a subset of the live collaborative filtering method may not account for the traffic at Google News used the combined method; the individual variability between users [3]. For example, we result showed significant improvement over the existing observed that entertainment news stories are constantly collaborative filtering method. The experiment on live recommended to most of the users, even for those users traffic also revealed a number of interesting issues related who never clicked on entertainment stories. The reason is to recommendations, serendipitous exploration and user that entertainment news stories are generally very popular, satisfaction. We will discuss these issues later in this paper. thus there are always enough clicks on entertainment stories The contribution of this paper is three-fold. First, we report from a user’s “neighbors” to make the recommendation. a large-scale log analysis of the consistency of users' news A solution to these two problems would be to build profiles interest. Second, we propose a novel method for predicting of user’s genuine interests and use them to make news a user’s news interests based on click behavior which recommendations. The profiles would help the system filter combines the genuine interests of the user and the current out the stories that are not of interest to the user, such as the news trend. Third, we combined information filtering and entertainment news mentioned above. A news story may collaborative filtering for personalized news also be recommended to the user if it matches her interest, recommendation and ran an experiment on the live traffic, even if the story has not been clicked on by other users. showing improved results. In this paper, we describe an information filter method to PERSONALIZATION IN GOOGLE NEWS recommend news articles based on their topic categories, Google News is a computer-generated news website that which is assigned by text classifiers. Based on a user’s aggregates headlines from news sources worldwide. It news reading history the information filtering component classifies news articles into different topic categories (e.g. predicts the topic categories of interest to her each time she “world”, “sport”, “entertainment”, etc.) and displays them visit Google News. News articles in those categories are in corresponding sections, as do standard news websites, ranked higher in the candidate list and will be but with fully automated text-based classification. Google recommended to the user. We chose to recommend news News serves millions of users around the world, and stories at the general level of topic categories instead of fine provides numerous editions for different countries and grained topics because of the nature of news reading: most languages. users visit news websites with the attitude of “show me something interesting,” rather than having any specific Users usually visit Google News starting from the information goals [7]. Over-specializing the user profile homepage. The homepage of the standard edition has the may limit the recommendations to news that the user Top Stories section on the top of the homepage, followed already knew, which is obviously undesirable for news by topic based sections of news articles, like “world” and reading. “sport”. The user activities that Google News records are the user’s If a user signs in to her Google Account and explicitly clicks on the Google News website.
Recommended publications
  • The State of the News: Texas
    THE STATE OF THE NEWS: TEXAS GOOGLE’S NEGATIVE IMPACT ON THE JOURNALISM INDUSTRY #SaveJournalism #SaveJournalism EXECUTIVE SUMMARY Antitrust investigators are finally focusing on the anticompetitive practices of Google. Both the Department of Justice and a coalition of attorneys general from 48 states and the District of Columbia and Puerto Rico now have the tech behemoth squarely in their sights. Yet, while Google’s dominance of the digital advertising marketplace is certainly on the agenda of investigators, it is not clear that the needs of one of the primary victims of that dominance—the journalism industry—are being considered. That must change and change quickly because Google is destroying the business model of the journalism industry. As Google has come to dominate the digital advertising marketplace, it has siphoned off advertising revenue that used to go to news publishers. The numbers are staggering. News publishers’ advertising revenue is down by nearly 50 percent over $120B the last seven years, to $14.3 billion, $100B while Google’s has nearly tripled $80B to $116.3 billion. If ad revenue for $60B news publishers declines in the $40B next seven years at the same rate $20B as the last seven, there will be $0B practically no ad revenue left and the journalism industry will likely 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 disappear along with it. The revenue crisis has forced more than 1,700 newspapers to close or merge, the end of daily news coverage in 2,000 counties across the country, and the loss of nearly 40,000 jobs in America’s newsrooms.
    [Show full text]
  • Android Turn Off Google News Notifications
    Android Turn Off Google News Notifications Renegotiable Constantine rethinking: he interlocks his freshmanship so-so and wherein. Paul catapult thrillingly while agrarian Thomas don phrenetically or jugulate moreover. Ignescent Orbadiah stilettoing, his Balaamite maintains exiles precious. If you click Remove instead, this means the website will be able to ask you about its notifications again, usually the next time you visit its homepage, so keep that in mind. Thank you for the replies! But turn it has set up again to android turn off google news notifications for. It safe mode advocate, android turn off google news notifications that cannot delete your android devices. Find the turn off the idea of android turn off google news notifications, which is go to use here you when you are clogging things online reputation and personalization company, defamatory term that. This will take you to the preferences in Firefox. Is not in compliance with a court order. Not another Windows interface! Go to the homepage sidebar. From there on he worked hard and featured in a series of plays, television shows, and movies. Refreshing will bring back the hidden story. And shortly after the Senate convened on Saturday morning, Rep. News, stories, photos, videos and more. Looking for the settings in the desktop version? But it gets worse. Your forum is set to use the same javascript directory for all your themes. Seite mit dem benutzer cookies associated press j to android have the bell will often be surveilled by app, android turn off google news notifications? This issue before becoming the android turn off google news notifications of android enthusiasts stack exchange is granted permission for its notification how to turn off google analytics and its algorithms.
    [Show full text]
  • The Future of Voice and the Implications for News (Report)
    DIGITAL NEWS PROJECT NOVEMBER 2018 The Future of Voice and the Implications for News Nic Newman Contents About the Author 4 Acknowledgements 4 Executive Summary 5 1. Methodology and Approach 8 2. What is Voice? 10 3. How Voice is Being Used Today 14 4. News Usage in Detail 23 5. Publisher Strategies and Monetisation 32 6. Future Developments and Conclusions 40 References 43 Appendix: List of Interviewees 44 THE REUTERS INSTITUTE FOR THE STUDY OF JOURNALISM About the Author Nic Newman is Senior Research Associate at the Reuters Institute and lead author of the Digital News Report, as well as an annual study looking at trends in technology and journalism. He is also a consultant on digital media, working actively with news companies on product, audience, and business strategies for digital transition. Acknowledgements The author is particularly grateful to media companies and experts for giving their time to share insights for this report in such an enthusiastic and open way. Particular thanks, also, to Peter Stewart for his early encouragement and for his extremely informative daily Alexa ‘flash briefings’ on the ever changing voice scene. The author is also grateful to Differentology and YouGov for the professionalism with which they carried out the qualitative and quantitative research respectively and for the flexibility in accommodating our complex and often changing requirements. The research team at the Reuters Institute provided valuable advice on methodology and content and the author is grateful to Lucas Graves and Rasmus Kleis Nielsen for their constructive and thoughtful comments on the manuscript. Also thanks to Alex Reid at the Reuters Institute for keeping the publication on track at all times.
    [Show full text]
  • How to Get Your Website Added to Google News and Drive Real Time Traffic
    How to Get Your Website Added to Google News and Drive Real Time Traffic ● Let’s look at some of the best practices for getting added to Google News and how you can get real-time traffic. Adhere to the Principles of Good Journalism ● If you look at recent additions to the Google News syndication platform, you’ll notice that Google is no longer 100% focused on news-related “current events”- type content. ● It’s evolved over the years, leveling the playing field for bloggers, content creators and news media experts. This evolution may not be obvious from the headlines, but the content reveals this expansion. ● However, the principles of good journalism haven’t been discarded. Google still cares about the style and substance of articles. Good journalism is all about being honest and as objective as possible. ● Why do you think Google crawls, indexes and publishes content from CNN, BBC, Techcrunch, The Wall Street Journal and others? ● One of the reasons is because these sites adhere to strict standard journalism practices. They’re transparent and they adhere to the same professional standard. ● Your reporting must be original, honest and well-structured. Standard journalism is all about investigation. So, you should be able and ready to investigate a story and authenticate it, before reporting it. ● For your story to strike a chord with editors, who will in turn syndicate it at Google News, PBS recommends that you present information from the most to the least important content points. ● There’s an established application process to get your stories featured on Google News.
    [Show full text]
  • What Do News Aggregators Do? Evidence from Google News in Spain and Germany*
    What Do News Aggregators Do? Evidence from Google News in Spain and Germany* Joan Calzada† Ricard Gil‡ December 2018 Abstract The impact of aggregators on news outlets is ambiguous. In particular, the existing theoretical literature highlights that although aggregators create a market expansion effect when they bring visitors to news outlets, they also generate a substitution effect if some visitors switch from the news outlets to the aggregators. Using the shutdown of the Spanish edition of Google News in December of 2014 and difference-in-differences methodology, this paper empirically examines the relevance of these two effects. We show the shutdown of Google News in Spain decreased the number of daily visits to Spanish news outlets between 8% and 14%, and that this effect was larger in outlets with less overall daily visits and a lower share of international visitors. We also find evidence suggesting that the shutdown decreased online advertisement revenues and advertising intensity at news outlets. We then analyze the effect of the opt-in policy adopted by the German edition of Google News in October of 2014. Although such policy did not significantly affect the daily visits of all outlets that opted out, it reduced by 8% the number of visits of the outlets controlled by the publisher Axel Springer. Our results demonstrate the existence of a net market-expansion effect through which news aggregators increase consumers' awareness of news outlets' contents, thereby increasing their number of visits. * We thank Shane Greenstein, Avi
    [Show full text]
  • Google Benefit from News Content
    Google Benefit from News Content Economic Study by News Media Alliance June 2019 EXECUTIVE SUMMARY: The following study analyzes how Google uses and benefits from news. The main components of the study are: a qualitative overview of Google’s usage of news content, an analysis of news content on Google Search, and an estimate of revenue Google receives from news. I. GOOGLE QUALITATIVE USAGE OF NEWS ▪ News consumption increasingly shifts towards digital (e.g., 93% in U.S. get some news online) ▪ Google has increasingly relied on news to drive consumer engagement with its products ▪ Some examples of Google investment to drive traffic from news include: o Significant algorithmic updates emphasize news in Search results (e.g., 2011 “Freshness” update emphasized more recent search results including news) ▪ Google News keeps consumers in the Google ecosystem; Google makes continual updates to Google News including Subscribe with Google (introduced March 2018) ▪ YouTube increasingly relies on news: in 2017, YouTube added “Breaking News;” in 2018, approximately 20% of online news consumers in the US used YouTube for news ▪ AMPs (accelerated mobile pages) keep consumers in the Google ecosystem II. GOOGLE SEARCH QUANTITATIVE USAGE OF NEWS CONTENT A. Key statistics: ▪ ~39% of results and ~40% of clicks on trending queries are news results ▪ ~16% of results and ~16% of clicks on the “most-searched” queries are news results B. Approach ▪ Scraped the page one of desktop results from Google Search o Daily scrapes from February 8, 2019 to March 4, 2019
    [Show full text]
  • Copyright, Safe Harbors, and International Law
    \\server05\productn\N\NDL\84-1\NDL106.txt unknown Seq: 1 10-DEC-08 7:42 OPTING OUT OF THE INTERNET IN THE UNITED STATES AND THE EUROPEAN UNION: COPYRIGHT, SAFE HARBORS, AND INTERNATIONAL LAW Hannibal Travis* INTRODUCTION .................................................. 332 R I. THE DEVELOPMENT OF “WEB 2.0” ......................... 336 R II. COPYRIGHT LAW IN THE UNITED STATES AND THE EUROPEAN UNION ................................................... 339 R A. Exclusive Rights ....................................... 339 R B. Fair Use .............................................. 341 R C. Other Exceptions to Copyright ........................... 342 R III. OPTING OUT OF THE INTERNET IN THE UNITED STATES ..... 343 R A. The Early Cases: Threats of “Unreasonable” Liability....... 343 R B. The “Safe Harbors” of the Digital Millennium Copyright Act 347 R C. An Emerging Consensus That Copyright Holders Must Opt In to the Internet? ..................................... 350 R D. The Shift Toward an Opt-Out Copyright System in the Napster Case ......................................... 354 R E. The Internet Matures: Courts Move to an Opt-Out Copyright System ................................................ 358 R 2008 Hannibal Travis. Individuals and nonprofit institutions may reproduce and distribute copies of this Article in any format, at or below cost, for educational purposes, so long as each copy identifies the author, provides a citation to the Notre Dame Law Review, and includes this provision and copyright notice. * Visiting Associate Professor
    [Show full text]
  • Google News Google News Is an App That Helps You Find the Top News Stories
    Google News Google News is an app that helps you find the top news stories. Presented by With funding from Google News 1 Find the Google News app 2 Google News homepage Ask your adult learner to find the Google News app. They can look for it on the device’s home screen or in the app Explain to your adult learner that this launcher. is the homepage for Google News. The Ask your adult learner to tap on the homepage has stories that Google thinks you’ll be interested in. Google News app to open it Explain that they can also search for If Google News is not installed on the any news topic using the search device, ask your adult learner to open button in the top left corner the Google Play Store. They can search Ask your adult learner to search for for Google News and download the app. any kind of news story. For example, If your learner has never downloaded environment. apps before, do this lesson first. 3 Headlines 4 Canada and world news Explain to your adult learner that the top of Google News has different news categories Ask your adult learner to tap the Get your adult learner to tap on Headlines button on the bottom of the Canada Google News app Explain that this is where they can Explain that Headlines contains top news see the top news in Canada stories Ask them to swipe the news Ask your adult learner to scroll down categories from right to left to see and see other news articles in the other news categories, such as Sports Headlines section and Health 5 Local news 6 Change the language Ask your adult learner to tap the Explain to your adult learner that Following button on the menu they can also change the language Ask them to scroll down to the Local that they read the news in section.
    [Show full text]
  • Google Data Collection —NEW—
    Digital Content Next January 2018 / DCN Distributed Content Revenue Benchmark Google Data Collection —NEW— August 2018 digitalcontentnext.org CONFIDENTIAL - DCN Participating Members Only 1 This research was conducted by Professor Douglas C. Schmidt, Professor of Computer Science at Vanderbilt University, and his team. DCN is grateful to support Professor Schmidt in distributing it. We offer it to the public with the permission of Professor Schmidt. Google Data Collection Professor Douglas C. Schmidt, Vanderbilt University August 15, 2018 I. EXECUTIVE SUMMARY 1. Google is the world’s largest digital advertising company.1 It also provides the #1 web browser,2 the #1 mobile platform,3 and the #1 search engine4 worldwide. Google’s video platform, email service, and map application have over 1 billion monthly active users each.5 Google utilizes the tremendous reach of its products to collect detailed information about people’s online and real-world behaviors, which it then uses to target them with paid advertising. Google’s revenues increase significantly as the targeting technology and data are refined. 2. Google collects user data in a variety of ways. The most obvious are “active,” with the user directly and consciously communicating information to Google, as for example by signing in to any of its widely used applications such as YouTube, Gmail, Search etc. Less obvious ways for Google to collect data are “passive” means, whereby an application is instrumented to gather information while it’s running, possibly without the user’s knowledge. Google’s passive data gathering methods arise from platforms (e.g. Android and Chrome), applications (e.g.
    [Show full text]
  • Top 9 Free Google Tools
    Top 9 Google Tools for Your Business Websites 4 Small Business Tel: 02 9907 7777 web4business.com.au Google is mostly known for 1. Google Analytics its search capabilities, but If you have a website, Google Analytics can did you know there are track all your traffic, referrals, ads, sales and many other services that conversions. It gives you insight into how visitors behave on your website: Google offers and most of them are free – tools - which page they arrived on, beyond Google Maps, - how long they stay, - how many pages they view, Gmail and YouTube. - what search words they use to find you Check out the following and so much more. excellent products which It also lets you see how effective your social may help run your business media efforts are and which parts of your more efficiently. website are performing well. To install Google Analytics, all you (or your web developer) need to do is place some simple HTML code into your site and the tracking begins immediately. Find out more at: www.google.com/analytics/ Google Analytics not only lets you measure sales and conversions, but also gives you fresh insights into how visitors use your site, how they arrived on your site, and how you can keep them coming back.. 2. Google Alerts Google Alerts lets you set up email updates of the latest relevant Google results, so you can monitor current news stories, keep an eye on competitors or the industry and to find out what others may be saying about you or your products and services.
    [Show full text]
  • Adding Realtime Coverage to the Google Knowledge Graph
    Adding Realtime Coverage to the Google Knowledge Graph Thomas Steiner1?, Ruben Verborgh2, Raphaël Troncy3, Joaquim Gabarro1, and Rik Van de Walle2 1 Universitat Politècnica de Catalunya – Department lsi, Spain {tsteiner,gabarro}@lsi.upc.edu 2 Ghent University – IBBT, ELIS – Multimedia Lab, Belgium {ruben.verborgh,rik.vandewalle}@ugent.be 3 EURECOM, Sophia Antipolis, France [email protected] Abstract. In May 2012, the Web search engine Google has introduced the so-called Knowledge Graph, a graph that understands real-world entities and their relationships to one another. Entities covered by the Knowledge Graph include landmarks, celebrities, cities, sports teams, buildings, movies, celestial objects, works of art, and more. The graph enhances Google search in three main ways: by disambiguation of search queries, by search log-based summarization of key facts, and by explo- rative search suggestions. With this paper, we suggest a fourth way of enhancing Web search: through the addition of realtime coverage of what people say about real-world entities on social networks. We report on a browser extension that seamlessly adds relevant microposts from the social networking sites Google+, Facebook, and Twitter in form of a panel to Knowledge Graph entities. In a true Linked Data fashion, we inter- link detected concepts in microposts with Freebase entities, and evaluate our approach for both relevancy and usefulness. The extension is freely available, we invite the reader to reconstruct the examples of this paper to see how realtime opinions may have changed since time of writing. 1 Introduction With the introduction of the Knowledge Graph, the search engine Google has made a significant paradigm shift towards “things, not strings” [3], as a post on the official Google blog states.
    [Show full text]
  • GNI Youtube Sustainability Lab Project Proposal
    GNI YouTube Sustainability Lab Project Proposal Instructions ● Please note that for news organizations selected for participation in the program, this proposal may be incorporated into the contractual agreement. ● Questions? Email us at [email protected]. Organization information 1. Please​ ​provide​ ​your​ ​organization’s​ ​name​ ​and​ ​parent​ ​company​ ​name​ ​(if applicable).* 2. Where​ ​is​ ​your​ ​company​ ​incorporated​ ​(or,​ ​if​ ​not​ ​incorporated,​ ​where​ ​is​ ​it registered,​ ​or​ ​if​ ​not​ ​registered,​ ​where​ ​does​ ​it​ ​have​ ​principal​ ​place​ ​of​ ​business)?* 3. Please provide your organization’s physical location. Please include the street address, postal code, city, state or province, and country.* 4. What is your organization's website URL?* 5. What type below most accurately describes your organization?* a. Broadcaster b. Traditional publisher c. Digital publisher d. Agency e. YouTube creator f. Other 6. If you are applying as part of a collaboration with other organization(s), please specify your partner’s or partners’ names. 7. Please​ ​briefly​ ​describe​ ​your​ ​organization or, if you are applying as a collaboration, of all organizations.* (1000 characters) 8. Are you, or any member of your organization (or any organization part of your collaboration), currently employed by or working on behalf of a government organization, in any capacity? Or, is your news organization (or any organization part of your collaboration) funded by the government, or does any government organization hold any level of ownership, control, or voting rights?* a. Yes b. No 9. If yes, please provide details on the nature of your or your organization’s relationship with the government. (1000 characters) 10. Please provide a link to your organization’s YouTube channel.
    [Show full text]