Chapter 3 Personalised Tag Suggestion Using Social Context
Total Page:16
File Type:pdf, Size:1020Kb
Open Research Online The Open University’s repository of research publications and other research outputs Exploiting Social Networks for Recommendation in Online Image Sharing Systems Thesis How to cite: Rae, Adam (2012). Exploiting Social Networks for Recommendation in Online Image Sharing Systems. PhD thesis The Open University. For guidance on citations see FAQs. c 2012 Adam Rae Version: Version of Record Link(s) to article on publisher’s website: http://dx.doi.org/doi:10.21954/ou.ro.00007e30 Copyright and Moral Rights for the articles on this site are retained by the individual authors and/or other copyright owners. For more information on Open Research Online’s data policy on reuse of materials please consult the policies page. oro.open.ac.uk THE OPEN UNIVERSITY Exploiting Social Networks for Recommendation in Online Image Sharing Systems by Adam Rae A thesis submitted for the degree of Doctor of Philosophy August 2011 Contents List of Figures v List of Tables vii List of Publications and Patents ix Acknowledgements xii Abstract xiv Declaration of Authorship xvi Note Regarding URLs xvii 1 Introduction 1 1.1 Motivation ........................................ 3 1.1.1 The growth of online media sharing .................... 6 1.1.2 The importance of recommender systems ................. 6 1.1.3 The weaknesses of current approaches ................... 7 1.1.4 The potential of social context awareness ................. 7 1.2 Hypothesis ........................................ 8 1.2.1 Main hypothesis ................................ 8 1.2.2 Sub-questions and breakdown ........................ 9 1.3 Contributions ...................................... 11 1.3.1 Suggesting tags for photos using personalised data and social graphs . 11 1.3.2 Predicting Flickr Favourites using social context information ..... 11 1.4 Thesis organisation ................................... 12 2 Social Context Use in Digital Image Recommendation Systems 15 2.1 Photography from paper to pixels .......................... 16 2.2 Recommender and suggestion systems ....................... 21 2.2.1 Motivation ................................... 21 2.2.2 Content-based recommendation ...................... 25 2.2.3 Collaborative recommendation ....................... 27 2.2.4 Hybrid recommendation ........................... 30 i ii CONTENTS 2.2.5 Common problems in recommendation .................. 30 2.2.6 New approach proposal ............................ 31 2.3 Social context data ................................... 33 2.3.1 Overview of graphs and social graphs in particular ............ 33 2.3.2 Social networks online ............................ 33 2.3.3 Analysing groups ................................ 36 2.3.4 Social context data ............................... 38 2.4 Text-based feature extraction ............................. 38 2.4.1 Common forms of textual metadata .................... 38 2.4.2 Why use metadata? .............................. 41 2.4.3 Tag handling techniques ........................... 44 2.4.4 Folksonomies .................................. 44 2.4.5 Tag recommendation ............................. 44 2.5 Feature extraction for image content-based IR .................. 46 2.5.1 Details of existing features .......................... 48 Texture ................................. 49 Colour .................................. 50 Salient points .............................. 51 Other image features ......................... 52 2.6 Evaluation datasets for socially-shared multimedia ................ 53 2.6.1 Introduction .................................. 53 Manually-created ........................... 53 Community-generated ........................ 54 2.6.2 Criteria for high-quality social media datasets .............. 55 1. Realistic sample of environment ................. 55 2. Rich media, photos and/or video ................ 56 3. Contains rich social data ..................... 56 4. Good size .............................. 56 5. Availability for repetition ..................... 57 2.6.3 The need for a social media datasets .................... 57 2.6.4 Existing social media and recommendation datasets ........... 58 2.6.4.1 Social media datasets ....................... 58 2.6.4.2 Recommendation datasets .................... 60 2.6.5 Dataset evaluation summary ......................... 63 2.7 Reflection on the state of the art ........................... 65 3 Personalised Tag Suggestion Using Social Context 67 3.1 Motivation ........................................ 68 3.2 Tag suggestion ...................................... 70 3.2.1 Problem specification ............................. 70 3.2.2 Social features ................................. 71 3.2.3 Tag suggestion using social context ..................... 71 3.3 A probabilistic approach using social graphs .................... 72 3.3.1 Probabilistic prediction framework ..................... 72 3.3.2 Personal Context (PC) ............................ 75 3.3.3 Social Contact Context (SCC) ........................ 75 3.3.4 Social Group Context (SGC) ......................... 76 CONTENTS iii 3.3.5 Collective Context (CC) ........................... 76 3.3.6 Tag co-occurrence multigraph definitions ................. 77 3.3.7 Aggregation methods ............................. 81 3.3.8 Data processing ................................ 85 3.3.9 Experiment strategy .............................. 90 3.4 Experiment design ................................... 91 3.4.1 Task ....................................... 91 3.4.2 Input tag selection ............................... 92 3.4.3 Evaluation considerations .......................... 94 3.5 First stage: feasibility study .............................. 96 3.5.1 Dataset and users ............................... 96 3.5.2 Evaluation of results .............................. 98 3.5.2.1 Performance of individual Contexts .............. 99 3.5.2.2 Performance of combined Contexts .............. 102 3.5.3 Feasibility study evaluation .......................... 103 3.6 Second stage: experiment ............................... 104 3.6.1 Data collection ................................. 104 3.6.2 Results ...................................... 106 3.6.2.1 Performance of individual Contexts .............. 106 3.6.2.2 Performance of combined Contexts .............. 109 3.7 Conclusions ....................................... 111 3.8 Reflection on tag suggestion experiments ...................... 112 4 Identifying Flickr Favourites Using Social Context 115 4.1 Introduction ....................................... 116 4.1.1 Problem specification ............................. 119 4.1.2 Predicting Favourite photos ......................... 121 4.2 Experiment ........................................ 122 4.2.1 The multi-modal feature space ....................... 122 4.2.2 Supervised learning for classification .................... 122 4.2.3 Evaluation considerations .......................... 126 4.2.4 Use-case scenarios ............................... 130 4.2.5 Datasets ..................................... 131 4.2.6 Social features ................................. 134 4.2.7 Textual features ................................. 136 4.2.8 Visual features ................................. 137 4.2.8.1 Geometry .............................. 138 4.2.8.2 Contrast ............................... 139 4.2.8.3 Saturation, brightness, sharpness and colourfulness ..... 139 4.2.8.4 Naturalness ............................. 141 4.2.8.5 Texture ................................ 142 4.2.9 Implementation ................................ 142 Training gradient boosted decision tree. .............. 143 Experimental runs ........................... 143 4.3 General classifier evaluation .............................. 144 4.3.1 Overall performance .............................. 144 4.3.1.1 Results for user with 100+ Favourites .............. 145 iv CONTENTS 4.3.1.2 Results for user with 50-99 Favourites ............. 157 4.3.1.3 Results for user with 10-49 Favourites ............. 158 4.3.1.4 Results for user with 5-9 Favourites ............... 158 4.3.2 Summary of general classifier approach .................. 158 4.3.3 Personalising classifiers ............................ 159 4.4 Individual classifier evaluation ............................ 159 4.4.1 Across metrics ................................. 159 4.4.2 Across user sets ................................. 161 4.4.3 Between scenarios ............................... 162 4.4.4 Comparison with general classifier approach ............... 162 4.4.5 Summary of individually trained model findings ............. 164 4.4.6 The value of personalising classifiers .................... 164 4.5 Conclusions ....................................... 165 4.6 Reflection on Flickr Favourites ............................ 168 5 Conclusions and Future Direction 171 5.1 Hypothesis evaluation ................................. 172 5.2 Limitations and future work .............................. 176 Appendix 181 List of Figures 2.1 Example of Apple QuickTake 100 digital camera ................. 18 2.2 A number of websites provide photo sharing facilities, with varying services and functionalities. Logos are the property of their respective owners. .... 19 2.3 The set of ratings R for all users U for all items I ................. 24 2.4 Using vectors of item feature values as a basis for recommendation. ...... 25 2.5 Using vectors that encode user ratings as a basis for recommendation. .... 27 2.6 Proposed extension to existing recommender paradigms that includes social context