Temporal Preference Analysis in Recommender Systems
Temporal Preference Analysis In Recommender Systems
Master’s Thesis

Marek Karczewski

THESIS submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in COMPUTER SCIENCE by Marek Karczewski, born in Warsaw, Poland

Web Information Systems Group
Department of Software Technology
Faculty EEMCS, Delft University of Technology
Delft, the Netherlands
www.wis.ewi.tudelft.nl

© 2010 Marek Karczewski.
Cover picture: Temporal Effects (plotted series: category_effect, movie_effect, maca_effect).

Author: Marek Karczewski
Student id: 9422407
Email: [email protected]

Abstract

This thesis presents the results of research into temporal preference analysis in recommender systems. Temporal preference analysis consists of methods for detecting time-recurrent changes in user preferences and for using this information to improve recommendation precision.

Thesis Committee:
Chair: Prof.Dr.Ir. G.J.P.M. Houben, Faculty EEMCS, TUDelft
University supervisor: Dr.Ir. A.J.H. Hidders, Faculty EEMCS, TUDelft
Committee Member: Ir. H.J.A.M. Geers, Faculty EEMCS, TUDelft

Preface

This thesis could not have been completed without the love and support of a number of people. First and foremost I would like to express my gratitude to the wonderful women in my life. Dear Mother, thank You for supporting and motivating me throughout the years. Dear Alexandra, words fall short of describing Your patience and dedication. I love You both very much.

I am much obliged to my supervisor Dr Jan Hidders. Thank You for providing the guidance and advice to keep the research on track. Your help with formalizing the theoretical basis for performing temporal preference analysis is greatly appreciated as well.
My thanks also go to Professor Geert-Jan Houben, head of the faculty of Web Information Systems at the Technical University of Delft. Thank You for admitting me to Your group and allowing me to perform my master’s research as part of it.

Special thanks go to Dr Jaap Aalders. The many talks we had on the cyclic nature of natural phenomena turned out to be more valuable than I could foresee.

Dear Yehuda Koren, thank You for explaining some details pertaining to Your method for generating recommendations with temporal dynamics.

Dear Dr. Mick Flanagan, thank You for Your support with using the Java Scientific Library package for performing linear regression analysis.

To all whom it may concern: I wish You a pleasant and interesting read.

Marek Karczewski
Delft, the Netherlands
November 17, 2010

Contents

Preface
Contents
List of Figures
1 Introduction
  1.1 Context
  1.2 Temporal preference analysis
  1.3 Research objectives
  1.4 Outline
2 Preparations
  2.1 Theory of recommender systems and temporal dynamics
  2.2 Theory of temporal preference analysis
  2.3 Implementation
3 Results
  3.1 Measuring temporal effects
  3.2 Applying temporal effects
  3.3 Recommendation precision
4 Conclusions and Future Work
  4.1 Contributions
  4.2 Reflection
  4.3 Future work
  4.4 Final conclusions
Bibliography
A Interest period parameters

List of Figures

1.1 Information Retrieval
1.2 Clustering
2.1 Notation
2.2 Main Classes
2.3 User Interface
3.1 Rating Series Regions
3.2 Gap Length
3.3 Last Period Length
3.4 Total Length of Rating Series
3.5 Maca Effect Count
4.1 Distribution of Maca Effects
4.2 Positivity and Last Length
4.3 Gap Length
A.1 Gap Neutrals
A.2 Gap Negatives
A.3 Gap Average
A.4 Last Average
A.5 Last Frequency
A.6 Last Negatives
A.7 Last Neutrals
A.8 Interest Periods
A.9 Disinterest Periods

Chapter 1 Introduction

In the past thirty years we have seen major technological advances that have led to an explosive growth of information processing systems. Some of these systems, for instance those used in hospitals, may have limited connectivity with the outside world and are intended for specialized use only. But the majority of present-day computer systems are interconnected by a common network called the Internet. In developed countries such as the Netherlands, access to the Internet is quite common. According to statistics from 2008, more than eighty percent of all Dutch citizens over sixteen years of age make use of the Internet.

We can trace the origin of the Internet back to a specialized network for providing robust communication between military computer systems. Nowadays the Internet is a network connecting mostly open systems that benefit a great number of people. The importance of this network might best be exemplified by the influence it has on natural language. Not only do we write “Internet” with a capital letter in English, but we also owe a number of new expressions to it. Surfing, blog, cyberspace, cookie, browser and downloading are just a few examples of words related to different aspects of the Internet.

Many areas of human activity such as commerce, entertainment and education are represented by various services on the Internet. A user of the Internet can visit websites such as Amazon or DealExtreme to do his shopping from the comfort of his home. He can read news on Onet or CNN and he can look up unfamiliar facts on Wikipedia. Video enthusiasts can visit YouTube, a popular online service for viewing and publishing video content.

The Internet is not restricted to websites. There are many dedicated applications that make use of the Internet for communication. Skype, a popular communication application, allows users to talk with each other free of charge over the Internet.
Fans of chess or go can play their favourite game on an Internet server by using dedicated client applications. The Internet also allows for easy file exchange, either by using the standard file transfer protocol, or by using a dedicated file sharing network like BT2Net or Gnutella.

The progress made in recent years introduces specific engineering challenges. The abundance of available information and usage options in modern computer systems often leads to various usability difficulties. In the context of this research we can formulate the following problem statement:

Problem Statement. The amount of stored and retrievable information in present-day computer systems causes difficulties in finding relevant information in a timely manner.

Specialized computer systems called information retrieval systems can assist the user in finding useful information. This thesis focuses on a special kind of information retrieval system called a recommender system. A recommender system presents items to the user in the form of personalized recommendations. In most cases, the user can provide feedback on the recommended items by giving them a rating. A characteristic feature of recommender systems is the ability to store the user’s preferences and filter the presented information accordingly. Recommender systems can thus alleviate the problem of finding relevant information by presenting interesting items to users, while minimizing the effort required to inform the system about individual needs for information.

A recommender system is only as useful as its predictions of future ratings are accurate. The ability to produce accurate predictions is called recommendation precision. The more precise a recommender system is, the better it can identify the items that should appeal to the user, i.e. receive high ratings.

In the quest to produce better recommender systems, an improvement of recommendation precision is one of the main goals.
Changes in user preferences can hinder the precise generation of recommendations. The research presented in this thesis is motivated by the desire to increase recommendation precision, while dealing with changes in user preferences. We can state:

Main Objective. The main objective of our research is to investigate the possibility of improving recommendation precision by analysing and acting upon cyclic changes in user preferences that take place over time.

Before formulating more specific research objectives, let’s first present some necessary background information. We’ll start by examining the role that information and information processing technology plays in our daily lives.

1.1 Context

1.1.1 The need for information

Our understanding of a studied subject, its place in the world and its relation to other subjects, is formed by processing information. The English word “information” is derived from the Latin verb “informare” meaning “to give form to the mind”, “to discipline”, “instruct” or “teach” [1]. The role of information is to shape our understanding of the surrounding world, our place in it, our relationships with other people and our choices regarding future conduct. Without this knowledge it would be difficult to attend to our needs in a proper way. It is important therefore, that the information we are receiving is both correct and relevant to the problems that we are trying to solve.

While informational correctness depends on the quality of the source, informational relevance depends on individual needs. Different people will show different needs for information, depending on their age, occupation, interests, living conditions and other factors. A person who is interested in the history of Mexico will not be helped by a course in linear algebra. Furthermore, the level of detail or abstraction of information concerning a particular area of interest should reflect the level of understanding of the subject.