Instant Foodie: Predicting Expert Ratings From Grassroots

Chenhao Tan∗ (Cornell University, Ithaca, NY) [email protected]
Ed H. Chi (Google Inc., Mountain View, CA) [email protected]
David Huffaker (Google Inc., Mountain View, CA) [email protected]
Gueorgi Kossinets (Google Inc., Mountain View, CA) [email protected]
Alexander J. Smola (Google Inc., Mountain View, CA) [email protected]
ABSTRACT

Consumer review sites and recommender systems typically rely on a large volume of user-contributed ratings, which makes rating acquisition an essential component in the design of such systems. User ratings are then summarized to provide an aggregate score representing a popular evaluation of an item. An inherent problem in such summarization is potential bias due to raters’ self-selection and heterogeneity in terms of experiences, tastes and rating scale interpretations. There are two major approaches to collecting ratings, which have different advantages and disadvantages. One is to allow a large number of volunteers to choose and rate items directly (a method employed by e.g. Yelp and Google Places). Alternatively, a panel of raters may be maintained and invited to rate a predefined set of items at regular intervals (such as in Zagat Survey).

Categories and Subject Descriptors

H.2.8 [Database Management]: Data Mining; H.3.m [Information Storage and Retrieval]: Miscellaneous; J.4 [Social and Behavioral Sciences]: Miscellaneous

Keywords

Restaurant ratings; Google Places; Zagat Survey; Latent variables

1. INTRODUCTION

The aggregate or average scores of user ratings carry significant weight with consumers, which is confirmed by a considerable body of work investigating the impact of online reviews on product sales [5, 7, 6, 11, 23, 41, 24].
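To make the self-selection and scale-heterogeneity problem concrete, the following minimal Python sketch (our illustration, not the paper's model; the raters, items, offsets, and the per-rater centering correction are all hypothetical) simulates three raters who use the rating scale differently, one of whom skips an item. A naive average then over-rates the skipped item, and a simple per-rater centering only partially corrects this, because each rater's observed mean confounds their personal scale offset with the quality of the items they chose to rate.

# Illustrative sketch (hypothetical data, not the paper's model):
# how scale heterogeneity and self-selection bias a naive average.
TRUE_QUALITY = {"A": 4.0, "B": 3.0}                 # assumed "true" scores
RATER_OFFSET = {"harsh": -1.0, "lenient": +1.0, "neutral": 0.0}
RATED_ITEMS = {"harsh": ["A"],                      # self-selection: the
               "lenient": ["A", "B"],               # harsh rater skips B
               "neutral": ["A", "B"]}

# Observed rating = true quality + the rater's personal scale offset.
ratings = [(r, i, TRUE_QUALITY[i] + RATER_OFFSET[r])
           for r, items in RATED_ITEMS.items() for i in items]

def naive_mean(item):
    scores = [s for _, i, s in ratings if i == item]
    return sum(scores) / len(scores)

# A simple correction: estimate each rater's offset as the deviation of
# their personal mean from the grand mean, then average offset-adjusted
# scores. This works only partially, because each rater's mean also
# reflects *which* items they chose to rate.
grand_mean = sum(s for _, _, s in ratings) / len(ratings)
rater_mean = {r: sum(s for rr, _, s in ratings if rr == r) /
                 sum(1 for rr, _, _ in ratings if rr == r)
              for r in RATED_ITEMS}

def adjusted_mean(item):
    adj = [s - (rater_mean[r] - grand_mean)
           for r, i, s in ratings if i == item]
    return sum(adj) / len(adj)

for item in ("A", "B"):
    print(item, naive_mean(item), round(adjusted_mean(item), 2))
# A: naive 4.0 (offsets happen to cancel), adjusted ~4.13
# B: naive 3.5 (biased up because the harsh rater skipped B), adjusted ~3.3

The residual error in the adjusted estimates comes from the confound between who rates what and how they use the scale; this kind of entangled structure is what latent-variable models (cf. the keywords above) are designed to disentangle.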