Using Meta-Data from Free-Text User
Total Page:16
File Type:pdf, Size:1020Kb
USING META-DATA FROM FREE-TEXT USER- GENERATED CONTENT TO IMPROVE PERSONALIZED RECOMMENDATION BY REDUCING SPARSITY XU XIAOYING (B.Eng. (Hons.), South China University of Technology) A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY DEPARTMENT OF INFORMATION SYSTEMS NATIONAL UNIVERSITY OF SINGAPORE 2015 DECLARATION I hereby declare that this thesis is my original work and it has been written by me in its entirety. I have duly acknowledged all the sources of information which have been used in the thesis. This thesis has also not been submitted for any degree in any university previously. __________________________________________ Xu Xiaoying 28 May 2015 I ACKNOWLEDGEMENTS My deepest gratitude goes first and foremost to Professor Anindya Datta, my supervisor, for his constant guidance and encouragement. This thesis could not have been accomplished without his illuminating instruction. The days working with him have become a memorable journey in my life. Second, I would like to express my heartfelt gratitude to my dear friends and my fellow classmates in NUS who gave me the kindest help. I will never forget the days spent with them. Last I want to thank my beloved parents and girlfriend, for their loving consideration and great patience in the past few years. I dedicate this thesis to them. II TABLE OF CONTENTS CHAPTER 1. INTRODUCTION ................................................................. 1 1.1. Overview of Recommender Systems............................................... 1 1.2. Problem Description ...................................................................... 2 1.3. Motivation and Research Focuses ................................................. 4 1.4. Contribution ................................................................................... 7 1.5. Organization of Thesis ................................................................... 9 CHAPTER 2. LITERATURE REVIEW ................................................... 10 2.1. Collaborative Filtering (CF) Recommendation ........................... 10 2.1.1 Memory-Based CF ............................................................... 10 2.1.2 Model-Based CF .................................................................. 11 2.1.3 Graph-Based CF .................................................................. 12 2.2. Content-Based (CB) Recommendation ........................................ 13 2.3. Social-Network-Based (SNB) Recommendation .......................... 15 2.4. User-Generated-Content (UGC) in Recommendation ................. 16 CHAPTER 3. STUDY ON ADDRESSING SPARSITY AND TRANSPARENCY ISSUES IN RECOMMENDER SYSTEMS BY USING ADJECTIVE FEATURES FROM USER REVIEWS .................. 19 3.1. Introduction .................................................................................. 19 3.2. Related Work ................................................................................ 22 3.2.1. Adjective Extraction ............................................................. 23 3.2.2. Singular Value Decomposition ............................................. 24 3.3. Intuition and Overview ................................................................ 25 3.4. Solution Details ............................................................................ 27 3.4.1. Review Crawler .................................................................... 28 3.4.2. POS Tagger .......................................................................... 29 III 3.4.3. Feature Extractor ................................................................. 30 3.4.4. Vector Generator .................................................................. 34 3.4.5. Movie Recommender ............................................................ 37 3.4.6. User Recommender .............................................................. 38 3.5. Experiment and Result ................................................................. 40 3.5.1. Evaluation Metrics ............................................................... 40 3.5.2. Experiment Results............................................................... 42 3.6. Conclusion ................................................................................... 54 CHAPTER 4. STUDY ON USING CRITIC REVIEWS TO BOOST NEW ITEM RECOMMENDATION ........................................................... 57 4.1. Introduction .................................................................................. 57 4.2. Related Work ................................................................................ 61 4.2.1. Partially Labeled Dirichlet Allocation ................................ 62 4.2.2. Non-negative Matrix Factorization ..................................... 63 4.3. Intuition and Overview ................................................................ 64 4.4. Solution Details ............................................................................ 66 4.4.1. Crawler ................................................................................ 67 4.4.2. Topic Modeler ...................................................................... 68 4.4.3. Profile Learner ..................................................................... 71 4.4.4. Recommender ....................................................................... 74 4.5. Experiment and Results................................................................ 75 4.5.1. Evaluation Metrics ............................................................... 76 4.5.2. Experiment Setup ................................................................. 77 4.5.3. Experimental Results ........................................................... 78 4.6. Conclusion ................................................................................... 86 CHAPTER 5. STUDY ON FUNCTIONALITY-BASED MOBILE APP RECOMMENDATION BY IDENTIFYING FUNCTIONAL ASPECTS FROM USER REVIEWS .............................................................................. 89 IV 5.1. Introduction .................................................................................. 89 5.2. Related Work ................................................................................ 94 5.2.1. Mobile App Recommendation .............................................. 94 5.2.2. PageRank-Based Methods ................................................... 96 5.3. Intuition and Overview ................................................................ 97 5.4. Solution Details ............................................................................ 99 5.4.1. App Data Crawler .............................................................. 101 5.4.2. Functionality Extractor ...................................................... 101 5.4.3. App Recommender ............................................................. 103 5.5. Experiment and Results.............................................................. 108 5.5.1. Evaluation Metrics ............................................................. 108 5.5.2. Experiment Setup ............................................................... 110 5.5.3. Experiment Results............................................................. 111 5.6. Conclusion ................................................................................. 118 CHAPTER 6. CONCLUSION ................................................................. 120 BIBLIOGRAPHY ........................................................................................ 122 V SUMMARY Recommender Systems (RS) have become increasingly essential in many domains for alleviating the “information overload” problem, but existing recommendation techniques suffer from the sparsity problem due to insufficient input data. In this thesis, we aim at extracting and incorporating meta-data from free- text User-Generated Content (UGC) to lessen the effects of sparsity and therefore improve the quality of recommendation. We achieve this goal by conducting three different studies, each of which proposes a recommendation solution that incorporates UGC from different perspectives, and addresses specific problems introduced by data sparsity in different contexts. In particular, in study one (Chapter 3), we show that adjective features embedded in user reviews are useful for characterizing movie features as well as user tastes. We extend the standard TF-IDF term weighting scheme by introducing Cluster Frequency (CLF) to automatically extract high quality adjective features from user reviews, and incorporate the extracted adjective features into a specific recommendation technique, i.e. Singular Value Decomposition (SVD) to show effectiveness. In study two (Chapter 4), we show that critic reviews of the items can be used to boost new item recommendation. We collect critic review articles for VI corresponding items in recommender system, and employ topic model to quantify the textual content. We adapt Non-negative Matrix Factorization (NMF) to incorporate the topics inferred from the critic reviews for recommendation, aiming at addressing the new item recommendation problem. Study three (Chapter 5) focuses on extracting functional aspects from user reviews for mobile app recommendation. With the extracted functional aspects, we are able to analyze user requirements at the functional level. We propose a graph-based ranking algorithm to predict new functionalities for users, and devise a competition mechanism to filter redundant recommendations. Our proposed solution is effective in improving stability against data sparsity and