Doctoral Dissertation
Doctoral Program in Computer and Control Engineering (29th cycle)

Content Recommendation Through Linked Data

By Iacopo Vagliano

Supervisor(s): Prof. Maurizio Morisio

Doctoral Examination Committee:
Dr. Fabien Gandon, Referee, Inria, CNRS, I3S
Prof. Andrea G. B. Tettamanzi, Referee, Université Côte d'Azur, Inria, CNRS, I3S
Prof. Fulvio Corno, Politecnico di Torino
Prof. Paweł Czarnul, Gdansk University of Technology
Prof. Marco Torchiano, Politecnico di Torino

Politecnico di Torino
2017

Declaration

I hereby declare that the contents and organization of this dissertation constitute my own original work and do not compromise in any way the rights of third parties, including those relating to the security of personal data.

Iacopo Vagliano 2017

This dissertation is presented in partial fulfillment of the requirements for the Ph.D. degree in the Graduate School of Politecnico di Torino (ScuDo). This dissertation is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Visit http://creativecommons.org/licenses/by-nc-sa/4.0/ to view a copy of this license.

To my dad

"It's not information overload. It's filter failure."
— Clay Shirky

Acknowledgements

With this thesis, a season of my life is coming to an end. I expected science to be about answers, and I discovered it is much more about questions. Actually, the more we learn, the more we understand how much we still need to explore; and if we knew exactly what we were doing, we would not call it research.

Firstly, I would like to thank my supervisor, Maurizio Morisio. I appreciate all the trust and freedom that I was granted in conducting my research, the valuable discussions and advice, and the chance to deal with situations which I thought were beyond my capabilities at the time. I am also grateful for the means provided by TIM (formerly Telecom Italia) to support my research. In particular, thanks to my second supervisor, Marco Marengo: your open mind, pragmatism, and interest in a real collaboration between industry and academia were rich stimuli for me.

I would like to express my sincere gratitude to all the colleagues of the SoftEng group. First of all, thanks to Cristhian Figueroa and Oscar Rodriguez for the fruitful collaboration, which motivated me a lot in these years. I am also grateful to Marco Torchiano for his availability and for sharing his experience. I also appreciated the precious help in technical subjects, and the hints for dealing with bureaucratic issues, that I received from Luca Ardito. I would also like to mention all the labmates who made the atmosphere in Lab 1 more pleasant: Rifat Rashid, Erion Cano, Riccardo Coppola, Diego Monti, Francesco Strada, Amirhosein Toosi and Alysson Dos Santos. Of course, it was also my pleasure to collaborate (and play table soccer during the breaks) with the other present and past members of the JOL MobiLab group. So, thanks to Lucia Longo, who was responsible for the TIM support in the final period of my Ph.D. program, Eleonora Gargiulo, Gianluca Cecchi, Mirko Rinaldini, Alessandro Izzo, Luisa Rocca, Enrico Catalano, and all the other interns and undergraduates who joined the group for a while in these years.

An outstanding experience during these years was the research visit at the Gdansk University of Technology in Poland. I would like to thank all the colleagues of the group, in particular Krzysztof Goczyła, who supervised me, Wojciech Waloszek, for the useful discussions, and Aleksandra Karpus, for the results reached through our joint effort. I am also grateful to Andrzej Wardziński, Aleksander Jarzębowicz and Jakub Miler for hosting me in their office.

A special thanks to the "train guys": Mattia Berardo, Marco Basso, Piergianni Serra, Paolo Arnolfo, and, occasionally, Marco Fanti, who have been my companions in the everyday Cuneo-Torino journeys (through good and bad) for two years. We found a nice trade-off between light and serious chat and, for this reason, we were hated by all those who just wished to sleep in the early morning. Marco Conoscenti, instead, put up with me for about one year as a flatmate at "Cumiana, 44". He found time to suggest intellectual books to me, to have ambitious discussions about everything and nothing, and, above all, to offer me delicious Sicilian food.

Finally, I cannot forget my strong foundation: my family and Zosia. Thanks for your unwavering support. Zosia, our path together started at the same time as the Ph.D. program, but in you I discovered something much more precious than any qualification or scientific discovery.

Abstract

Nowadays, people can easily obtain a huge amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload. Recommender systems are software tools to suggest interesting items to users and can help them to deal with a vast amount of information. Linked Data is a set of best practices to publish data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace. This thesis discusses how to discover information useful for the user from the vast amount of structured data, and notably Linked Data, available on the Web.

The work addresses this issue by considering three research questions: how to exploit existing relationships between resources published on the Web to provide recommendations to users; how to represent the user and their context to generate better recommendations for the current situation; and how to effectively visualize the recommended resources and their relationships.

To address the first question, the thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. The algorithm was integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare such algorithms and how to evaluate their performance when applied to a given dataset. The user evaluation showed that our algorithm improves the rate of new recommendations, while maintaining a satisfying prediction accuracy.

To represent the user and their context, this thesis presents the Recommender System Context ontology, which is exploited in a new context-aware approach that can be used with existing recommendation algorithms. The evaluation showed that this method can significantly improve the prediction accuracy.

As regards the problem of effectively visualizing the recommended resources and their relationships, this thesis proposes a visualization framework for DBpedia (the Linked Data version of Wikipedia) and mobile devices, which is designed to be extended to other datasets.

In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data were successfully exploited in recommender systems. Various proposed approaches were implemented and applied to use cases of Telecom Italia.

Contents

List of Figures

List of Tables

1 Introduction

2 Background
  2.1 Introduction
  2.2 Recommender Systems
    2.2.1 The Recommendation Problem
    2.2.2 Recommendation Techniques
  2.3 The Web of Data
    2.3.1 Beyond Data Silos
    2.3.2 Technology Stack and Linked Data Principles

3 Linked Data Based Recommender Systems
  3.1 Introduction
  3.2 Research Methodology
    3.2.1 Research Questions, Search String, and Sources
    3.2.2 Search and Selection
    3.2.3 Quality assessment, Data Extraction and Synthesis
  3.3 Results
    3.3.1 Included Studies
    3.3.2 Research Problems
    3.3.3 Contributions
    3.3.4 Use of Linked Data
    3.3.5 Application Domains
    3.3.6 Evaluation Techniques
    3.3.7 Future Work
    3.3.8 Limitations
  3.4 Discussion
    3.4.1 Specific Research Questions
    3.4.2 Limitations of Our Systematic Literature Review
  3.5 Conclusions

4 A Framework for Linked Data based Recommendation Algorithms
  4.1 Introduction
  4.2 The Allied Framework
  4.3 Implementation
    4.3.1 Core
    4.3.2 Generation Layer
    4.3.3 Ranking Layer
    4.3.4 Classification Layer
    4.3.5 Presentation Layer
  4.4 Conclusions

5 A Dynamic Recommendation Algorithm Based on Linked Data
  5.1 Introduction
  5.2 Related Work
  5.3 ReDyAl
    5.3.1 Principles
    5.3.2 Reducing the Search Space
    5.3.3 Parameter Settings
    5.3.4 Algorithm
    5.3.5 Ranking of the Recommended Resources
  5.4 User Evaluation
    5.4.1 Experiment
    5.4.2 Results
  5.5 Applications
    5.5.1 Mobile Movie Recommendations
    5.5.2 eTourism Platform
  5.6 Conclusions and Future Work

6 Leveraging Ontologies for Context-Aware Recommendations
  6.1 Introduction
  6.2 Context-Aware Recommender Systems
  6.3 Related Work
    6.3.1 Context Representation
    6.3.2 Ontology Based Recommender Systems
  6.4 The Recommender System Context Ontology
  6.5 Recommendation approach
    6.5.1 The Contextual User Profile Ontology
    6.5.2 Recommendation
  6.6 Evaluation
  6.7 Conclusions and Future Work

7 Visualizing Linked Data Based Recommendations
  7.1 Introduction
  7.2 Linked Data Visualization
  7.3 Towards a Linked Data Visualization Framework for Mobile Devices
    7.3.1 Use Cases
    7.3.2 DBpedia Mobile Explorer
  7.4 Conclusions and Future Work

8 Use Cases of a Telecommunication Operator to Recommend Resources for Further Information
  8.1 Introduction
  8.2 Text Classification and Annotation
  8.3 TellMeFirst
    8.3.1 Semantic Annotation
    8.3.2 Semantic Classification
    8.3.3 Components of TellMeFirst
  8.4 TellMeFirst in Practice
    8.4.1 Society
    8.4.2 FriendTV
  8.5 Conclusions and Future Work

9 Conclusions and Perspectives
  9.1 Summary of Contributions
  9.2 Limitations
  9.3 Publications
  9.4 Perspectives

References

Appendix A Selection and Synthesis in the Systematic Literature Review
  A.1 Initial set of papers
  A.2 Selected Papers
  A.3 Excluded Papers
  A.4 Thematic Synthesis

List of Figures

2.1 The Linked Data Cloud
2.2 The Semantic Web stack
2.3 An exemplifying graph representation of The Matrix

3.1 The process of our systematic literature review
3.2 Quality score for different types of study
3.3 Distribution of the Linked Data driven studies according to the recommendation techniques that they exploit
3.4 Distribution of the hybrid studies according to the recommendation techniques that they exploit
3.5 Distribution of the studies selected according to the application domain
3.6 Distribution of the studies according to the evaluation criteria used

4.1 Steps of the recommendation process
4.2 The conceptual architecture of the Allied framework
4.3 The layered architecture of our implementation of Allied
4.4 Example of a category graph for the resource Mole Antonelliana
4.5 The home view of the Allied web interface
4.6 An example of results shown in the Allied web interface
4.7 The set-up view for the desktop version of the Allied framework
4.8 An example of results as tree view for the desktop version of the Allied framework

5.1 Prediction accuracy and novelty of the algorithms evaluated
5.2 The graph view of V for Vendetta
5.3 The interactions between the main modules of the application
5.4 The overall system architecture
5.5 Semantic Annotator GE at a glance
5.6 Possible interactions in an eTourism use case

6.1 The PRISSMA vocabulary
6.2 The relations and concepts which extend prissma:Environment
6.3 Temperature representation in our ontology
6.4 The RSCtx's concepts and relations representing the location dimension
6.5 Time representation in our ontology as extension of Time and PRISSMA ontologies
6.6 User representation in RSCtx ontology
6.7 Emotion representation in RSCtx ontology
6.8 SIM at a glance
6.9 An example of COUP
6.10 General recommendation process
6.11 A context instance in RSCtx with only the time dimension
6.12 MAE of different algorithms computed per user on subsets with numeric ratings
6.13 MAE of different algorithms computed per user on subsets with descriptive ratings

7.1 The main components of our framework
7.2 User operations and corresponding SPARQL queries
7.3 A summary of the code generation and the configuration of our framework
7.4 Visualizing a resource
7.5 Browsing the categories of a resource

8.1 An example of the results displayed by the TellMeFirst visualizer
8.2 The graphic interface of the Society application for Android devices
8.3 Using TellMeFirst in FriendTV

A.1 The model of higher-order themes of our systematic review

List of Tables

2.1 A user-item matrix in a movie recommendation scenario
2.2 Hybridization methods
2.3 Comparison of the main recommendation techniques

3.1 The sources selected for our search process
3.2 Inclusion and exclusion criteria
3.3 Quality assessment checklist
3.4 Data extraction form
3.5 Distribution of the studies selected according to the problems that they addressed
3.6 Distribution of the studies according to the contributions provided
3.7 Distribution of the studies according to their use of Linked Data
3.8 Distribution of the studies according to the datasets used
3.9 Distribution of the studies according to the evaluation techniques used
3.10 Distribution of the selected studies according to the future work that they propose
3.11 Classification of Linked Data based recommendation approaches

5.1 Percentage of answers to Q1 by algorithm

6.1 A comparison of ontology-based context models
6.2 Statistics of ConcertTweets dataset at the time of the experiment
6.3 MAE values computed for whole test sets

8.1 Tests on the TellMeFirst's Disambiguator

A.1 Initial set of papers and keywords listed in each of them
A.2 Selected papers and corresponding studies
A.3 Papers excluded during the data extraction

Chapter 1

Introduction

Over the years, the amount of information generated on the Web has exploded. To have an idea of the enormity of this explosion, every minute over 3 million likes occur, roughly 243 thousand new photos are uploaded, and more than 3 million posts are shared on Facebook. At the same time, around 56 thousand pictures are uploaded on Instagram, about 430 thousand tweets are published, 300 hours of video are uploaded to YouTube, and 120 new accounts are opened on LinkedIn (statistics published in May 2016, available at http://www.go-globe.com/blog/60-seconds/). This means that nowadays people can easily obtain an enormous amount of information from the Web, but often they have no criteria to discern it. This issue is known as information overload.

Besides traditional search engines, researchers have developed more intelligent tools to help the user deal with an enormous amount of information, such as Recommender Systems (RS), which are software tools that suggest interesting items to the user [1]. At the same time, the Web has evolved from an information space for sharing textual documents into a medium for publishing structured data. Linked Data (http://linkeddata.org) is a set of best practices to publish and interlink data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace where data providers publish their content publicly. The Web of Data consists of a huge knowledge repository containing different kinds of information, varying from encyclopedic and linguistic to real-time (e.g. data streams) and user-generated content. Moreover, its key feature is the embedding of semantic relationships between the entities represented.

The idea of introducing semantics into RS is not new, and many works were proposed before Linked Data was conceived [2–9]. Most of these approaches exploited specific domain ontologies and taxonomies to support traditional techniques and addressed cold-start and data sparsity, two well-known problems of RS. However, they are not particularly suited to deal with datasets in the Web of Data, and new methods are required to incorporate Linked Data into RS by effectively exploiting their semantics [10]. The main reason to provide new approaches is that ontological RS rely on closed domain ontologies defined ad hoc, which often require a high maintenance effort, while the Web of Data is based on the open world assumption and its data models may change rapidly, since the vocabularies and ontologies used in it are designed to be extended easily. Additionally, datasets in the Web of Data are published according to the Linked Data principles by using the Resource Description Framework (RDF, http://www.w3.org/standards/techs/rdf) [11] data model, which represents information in a graph form (and is briefly described in Section 2.3.2). Thus, they require specific paradigms to be integrated into RS.

Di Noia and Ostuni [10] summarized the main benefits that Linked Data can provide to RS. Firstly, it consists of an enormous amount of multi-domain and ontological information which is freely available. Secondly, it provides standard access to data. Finally, it represents semantic relationships among different entities which are already structured, interlinked and based on ontologies. Besides, Linked Data can decrease the dependency on the user, since its interlinked nature enables content-based recommendations. For example, we could combine a Linked Data based approach with a classical recommendation method that exploits user ratings to mitigate the cold-start problem, which occurs when the system needs to recommend items to a new user (namely, a user who has not yet rated any item). Moreover, multilingual datasets such as DBpedia (http://dbpedia.org/) [12], the Linked Data version of Wikipedia, can enable cross-language applications, as Narducci et al. [13] have shown.

This thesis discusses how to discover useful information for the user from the vast amount of structured data, and notably Linked Data, available on the Web. In particular, the reference scenario, which summarizes the primary needs of Telecom Italia, is the following: a mobile user is coming back from work (for example, she is waiting for the bus or walking down the street) and wants to decide which movie to watch tonight. She may focus on this activity for a limited amount of time (soon the bus may arrive, or she may reach the destination she is walking to) and wants to explore on the Web some information about movies she may like, rather than read details about a specific movie. In this context, this thesis poses the following research questions:

RQ1 How can existing relationships between resources published on the Web be exploited to provide recommendations to users?

RQ2 How can the user and her context be represented to generate better recommendations for the current situation?

RQ3 How can the recommended resources and their relationships be effectively visualized?

Regarding RQ1, Linked Data may improve RS because they represent multi-domain knowledge, provide standard access to data, and represent semantic relationships among different entities, as previously explained. This thesis proposes a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. It dynamically analyzes the categories they belong to and their explicit references to other resources, then combines the results. The algorithm has been integrated into a framework to deploy and evaluate Linked Data based recommendation algorithms. In fact, a related problem is how to compare Linked Data based algorithms and how to assess their performance when applied to a particular dataset in the Web of Data. The user evaluation compared the proposed algorithm with state-of-the-art algorithms that rely only on Linked Data and showed that our algorithm improves the rate of new recommendations while maintaining a satisfying prediction accuracy.

Focusing on RQ2, Context-Aware Recommender Systems (CARS) aim to provide users with the most useful recommendations for their current situation. However, an exact context obtained from a user could be too specific and may not have enough data for accurate rating prediction [14]. This is a form of the data sparsity problem. To represent the user and her context, this thesis presents the Recommender System Context (RSCtx) ontology, which is combined with the Contextual Ontological User Profile (COUP) ontology to generate a new context-aware recommendation approach which can be used with existing recommendation algorithms. We applied RSCtx to context identification and generalization tasks and showed that it is possible to represent the context of a user who receives recommendations by combining different dimensions and representing different granularities for each dimension. The evaluation compared our approach with state-of-the-art algorithms and demonstrated that it can significantly improve the prediction accuracy.

Addressing RQ3, the problem was effectively visualizing the recommended resources and their relationships. In fact, a visualization tool for mobile devices which was not limited to a single domain was still lacking. This thesis proposes a visualization framework for DBpedia which is suitable for mobile devices and can be adapted to any dataset in the Web of Data, because it is based only on standard Linked Data languages, such as RDF and SPARQL (http://www.w3.org/standards/techs/sparql) [15], briefly introduced in Section 2.3.2. The framework has been applied to a mobile application to recommend movies, which was developed in collaboration with Telecom Italia.

In summary, this thesis shows how it is possible to exploit structured data available on the Web to recommend useful resources to users. Linked Data have been successfully applied to recommender systems to provide cross-domain and novel recommendations and to address well-known problems such as data sparsity. The challenges previously mentioned originated from the needs of Telecom Italia, which aimed to improve the mobile services offered and to increase the benefit for its users. Various solutions to these problems were applied to specific use cases provided by the company through the implementation of prototypes. Additionally, semantic annotation and classification techniques from the state of the art have been applied to some practical use cases provided by the company in order to recommend resources for further information from a given text in the context of social reading and social TV.

The thesis is organized as follows. Chapter 2 provides an overview of recommender systems and the Web of Data, along with a list of challenges. Chapter 3 describes the systematic literature review that we conducted about Linked Data based recommender systems to identify the research issues addressed in this thesis. Chapter 4 introduces Allied, the framework to deploy and evaluate Linked Data based recommendation algorithms in which ReDyAl has been integrated. ReDyAl is presented in Chapter 5, which also describes the evaluation of the algorithms and its application in use cases of Telecom Italia. Chapter 6 proposes RSCtx and shows how it has been applied to identify and generalize context. The chapter also shows how RSCtx is combined with COUP to generate a new context-aware recommendation approach and provides the results obtained while evaluating this approach. Chapter 7 describes DBpedia Mobile Explorer, a framework for DBpedia which is suitable for mobile devices and can be adapted to any dataset in the Web of Data. Chapter 8 shows how semantic annotation and classification techniques from the state of the art have been applied to some practical use cases provided by Telecom Italia to recommend resources for further information in social reading and social TV applications. Finally, Chapter 9 summarizes the contributions of this thesis and discusses the open issues.

Chapter 2

Background

2.1 Introduction

The huge amount of data available on the Web, together with the massive publication of user-generated content enabled by social networks and mobile devices, has generated an information overload: more information is produced than we can consume. Automatic filtering tools have become common to help the user deal with such an enormous amount of information. Recommender Systems (RS) represent a subset of these tools and aim to suggest interesting items to the user. At the same time, the Web is evolving from an information space for sharing textual documents into a medium for publishing structured data. Linked Data is a set of best practices to publish and interlink data on the Web, and it is the basis of the Web of Data, an interconnected global dataspace where data providers distribute their content publicly. Information on the Web is no longer intended only for humans; it is also available to machines. In this chapter, we first review the recommendation problem and the various approaches to address it (Section 2.2), then we introduce the basic principles and main features of the Web of Data (Section 2.3).

2.2 Recommender Systems

RS are software tools and techniques that provide suggestions for items likely to be of use to a user [1]. Item is the general term used to indicate what the system recommends; the suggested items can belong to different categories, e.g. songs, places, news, books, films, events, etc. A recommender system usually focuses on a specific type of item; thus, its graphical user interface and the core technique used to provide recommendations are optimized for the particular kind of item addressed.

According to Adomavicius and Tuzhilin [16], the roots of RS can be traced back to work in cognitive science, approximation theory, information retrieval, forecasting theories, management science, and consumer choice modeling in marketing. Developing RS is a multidisciplinary effort which still requires expertise from these various areas. RS are newer than other classical information management tools such as databases and search engines, since they emerged as an independent research field with the publication of the first studies on collaborative filtering (one of the most popular recommendation methods) in the mid-1990s [17–20]. Compared to search systems, recommender systems give users the possibility to discover new resources that they may not have initially thought about, without requiring them to formulate their needs explicitly [10].

The interest in this area is still high because it is a problem-rich research field, and practical applications of RS help users deal with information overload and provide personalized recommendations, content, and services to them [16]. Highly appreciated websites such as Amazon.com, YouTube, Tripadvisor, Last.fm, and Netflix are examples of these practical applications, since RS are among their key components.

2.2.1 The Recommendation Problem

The recommendation problem consists in finding, for each user, the item which maximizes the utility of the item for that user. A formal definition is provided by Adomavicius and Tuzhilin [16] and is described in Equation 2.1.

∀u ∈ U,   i*_u = arg max_{i ∈ I} f(u, i)        (2.1)

U is the set of users considered by the recommender system and I the set of items; both can be extremely large. The utility function f : U × I → R represents the usefulness of an item i ∈ I for a user u ∈ U, where R is a totally ordered set (e.g. nonnegative numbers within a given range). The utility of an item is often represented by a rating, which indicates how much a particular user liked a given item. For instance, Alice gave the movie The Green Mile the rating 4 out of 5. The fundamental problem is that the utility is not defined on the whole U × I space: only a subset is available, since only a portion of the ratings is known for each user. Thus, a recommender system has to estimate the utility function from the available data and use it to predict the unknown values. Typically, the recommendations are provided by selecting for each user the best N items, i.e. the items with the highest utility (top-N recommendations).

RS are information systems which need some input data to generate recommendations. These data primarily concern items and users, but they cannot always be exploited, because of the variety of sources and recommendation techniques used. Some methods only require basic data (e.g. ratings), while others are more knowledge dependent, e.g. they may rely on ontological descriptions of users and items, or on the social relations and activities of users. In any case, three essential elements are involved in RS: users, items, and ratings. The first are the actors who receive the recommendations; the second are the resources to recommend to users; the third are the users' preferences (as feedback) and constitute the relations between users and items. These elements are represented differently in the system depending on the recommendation technique used.

The availability of up-to-date users' preferences is often the primary need of RS. Users' feedback can be explicit or implicit, according to how it is collected [10]. Ratings are an example of explicit feedback. In this case, users express their opinion about an item through a numerical, ordinal or binary scale. For example, ratings can be given respectively as a number of stars (e.g. from 1 to 5), as one value among strongly agree, agree, neutral, disagree, strongly disagree, or as like / dislike. However, ratings are not the only form of representing utility. For instance, tags such as "too long" or "acting" can also provide some feedback from users [1]. It is important to note that, according to our definition of the recommendation problem, the primary goal of RS is predicting ratings. In the literature, this formulation is referred to as the rating prediction task [10]. Nevertheless, RS often have to provide the user with a ranked list of recommendations and, in many commercial systems, the best bet recommendations are shown, but the predicted rating values are not [21].

This is known as the top-N recommendation task [10, 21], ranking task [22], or item recommendation task [23], since the emphasis shifts from predicting ratings to ordering items according to the user's preferences.
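To make the top-N formulation concrete, the following minimal sketch (toy data and a placeholder utility estimator, not an algorithm from this thesis) ranks, for each user, the items not yet rated by their estimated utility f(u, i) and keeps the N best ones.

    from typing import Callable, Dict, List, Set

    def top_n(users: Set[str],
              items: Set[str],
              rated: Dict[str, Set[str]],
              predict: Callable[[str, str], float],
              n: int = 5) -> Dict[str, List[str]]:
        """For each user, rank the items not yet rated by the estimated
        utility f(u, i) and keep the n best ones (top-N recommendation)."""
        recommendations = {}
        for u in users:
            candidates = items - rated.get(u, set())
            ranked = sorted(candidates, key=lambda i: predict(u, i), reverse=True)
            recommendations[u] = ranked[:n]
        return recommendations

    # Toy data and a placeholder estimator standing in for f(u, i).
    users = {"Alice", "Bob"}
    items = {"The Green Mile", "The Matrix", "Slevin", "The Last Samurai"}
    rated = {"Alice": {"The Green Mile", "Slevin"}, "Bob": {"The Matrix", "Slevin"}}
    predict = lambda u, i: (len(u) * len(i)) % 7 / 7.0  # placeholder, not a real model
    print(top_n(users, items, rated, predict, n=2))

In a real system, predict would be replaced by the rating estimator of the chosen recommendation technique, such as those described in the next subsection.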

2.2.2 Recommendation Techniques

Various recommendation approaches have been proposed. They differ in the assessment of the utility function and in the data exploited. Typically, RS are classified according to the following main categories: content-based, collaborative filtering, knowledge-based, and hybrid [24]. A deep review of the recommendation approaches is out of the scope of this thesis; the reader is encouraged to refer to the book of Ricci et al. [1] or the survey of Adomavicius and Tuzhilin [16] for an exhaustive discussion, while Burke [24] provided a widely used taxonomy which provides an overview of the several types of RS. Furthermore, Adomavicius and Tuzhilin [16] also distinguish between heuristic-based and model-based RS, based on the techniques used for the rating estimation. Since these two categorizations are orthogonal, each type of RS can be further classified as heuristic-based or model-based. In the following, we introduce the essential features of each type of RS and summarize their strengths and weaknesses.

Content-based RS make suggestions based on the ratings that users gave to items and on the content of the items (e.g. extracted keywords, title, pixels, disk space, etc.) [25]. Basically, these systems recommend items which are similar to those that a given user liked in the past. For example, in a movie recommendation scenario, Alice might like The Shawshank Redemption because she liked The Green Mile. The similarity is based on the features which describe the item, e.g. the genre of a book or the optical zoom of a camera. These features can be extracted from unstructured or semi-structured data through text mining techniques or obtained from structured data (e.g. relational databases). Heuristic-based RS represent both items and users using Information Retrieval (IR) techniques (e.g. vectors of terms) and compute the similarity between their representations. The user profile is a vector of terms built from the analysis of the items liked by the user. The Vector Space Model (VSM) [26] is a widely used heuristic approach which models items and user profiles as weighted vectors, typically computed with the Term Frequency-Inverse Document Frequency (TF-IDF) formula [26]. This method is often combined with the cosine similarity to assess the similarity between items and user profiles, in order to recommend the items most similar to the user profile. Model-based approaches use Machine Learning techniques to generate a model of the user's preferences by analyzing the features of the items that the user has rated [10]. For each user, a regression or classification model is learned from a collection of items for which ratings are available. The training set consists of item feature vectors labeled with ratings, and the resulting model is used for estimating the unknown ratings.

Lops et al. [25] outlined the main pros and cons of this kind of system. The advantages are user independence, transparency, and the ability to handle new items. The first refers to the need for only the ratings of the user considered. The second allows the system to explain the recommendations provided by using the features of the recommended items. The third is the ability to recommend items which have not yet been rated. However, content-based RS have some limitations. They cannot provide recommendations which differ from the items already known by the user; this issue is known as overspecialization. Additionally, these systems rely on features extracted from the item content, thus the quality of the recommendations provided depends on the availability and the quality of the features; this problem is called limited content analysis. Finally, a new user is an issue, since enough ratings have to be collected before a recommender system can provide accurate recommendations.

                 The Green Mile   The Matrix   Slevin   The Last Samurai
    Alice        5                ?            2        ?
    Bob          ?                3            5        1
    John         4                ?            ?        5

Table 2.1 A user-item matrix in a movie recommendation scenario.
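As an illustration of the content-based approach described above, the following minimal sketch (standard library only, with a hypothetical toy corpus of item descriptions; it is not the implementation used in this thesis) builds TF-IDF vectors for the items, aggregates the liked items into a user profile, and recommends the unseen items most similar to that profile by cosine similarity.

    import math
    from collections import Counter

    def tfidf_vectors(docs):
        """Build TF-IDF weighted term vectors for a list of token lists."""
        n = len(docs)
        df = Counter(term for doc in docs for term in set(doc))
        vectors = []
        for doc in docs:
            tf = Counter(doc)
            vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
        return vectors

    def cosine(a, b):
        """Cosine similarity between two sparse term-weight dictionaries."""
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    # Toy item descriptions (features extracted from the item content).
    items = {
        "The Green Mile": "drama prison miracle friendship".split(),
        "The Shawshank Redemption": "drama prison friendship hope".split(),
        "The Matrix": "science fiction action hacker simulation".split(),
    }
    vectors = dict(zip(items, tfidf_vectors(list(items.values()))))

    # The user profile aggregates the vectors of the items the user liked.
    liked = ["The Green Mile"]
    profile = Counter()
    for title in liked:
        profile.update(vectors[title])

    # Recommend the unseen items most similar to the profile.
    scores = {t: cosine(profile, v) for t, v in vectors.items() if t not in liked}
    print(sorted(scores.items(), key=lambda x: x[1], reverse=True))

With this toy data, The Shawshank Redemption scores higher than The Matrix because it shares several weighted terms with the profile built from The Green Mile, which mirrors the example given in the text.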

Collaborative-filtering RS recommend items to a user by taking into account the ratings that users with similar preferences gave to these items [27]. The similarity of the users is based on the similarity of their past ratings. Collaborative filtering is the most implemented technique [1]. Its primary advantage is that it needs only ratings, which are relatively simple information and can be stored and processed efficiently; a second advantage is therefore scalability.

Usually, these systems rely on a user-item matrix, which is a table representing a user in each row, an item in each column, and the rating given by a user to an item in each cell. This matrix is typically sparse, because users rate few items. Table 2.1 shows an exemplifying matrix for a movie recommender system. For instance, Alice might like The Last Samurai because John liked it and John and Alice both gave a high rating to The Green Mile. Heuristic-based methods (also called memory-based [28]) rely on the k-nearest neighbors algorithm (kNN), which allows the system to predict missing ratings by aggregating the ratings of the closest neighbors. Heuristics can be user-based or item-based. The former recommend an item to a user through a linear combination of the neighbors' ratings, weighted by the similarity between these neighbors and the user. An example of this approach is given by one of the first RS [19], which uses the cosine similarity to infer the item similarity from the users' ratings; in this way, the system can suggest items similar to those the user has already liked. Lately, model-based techniques have emerged because they are more accurate than heuristic-based approaches [29]. The most popular model-based methods are based on matrix factorization: they reduce the user-item matrix by mapping users and items into a joint lower-dimensional latent factor space [30]. The main weakness of collaborative-filtering RS is that they need to know a certain amount of ratings before providing proper recommendations. Cold-start and data sparsity are two problems related to this need. The former occurs when the user has not rated enough items to allow the system to compute the similarity with other users [31], while the latter arises because users usually rate a small portion of the available items [32]. With sparse ratings, two users or items are unlikely to have common ratings, thus ratings are predicted with a low number of neighbors. Moreover, recommendations may be biased, because the similarity weights may be computed using only a small number of ratings. For this reason, new users and new items are also problematic: the system cannot calculate reliable similarities with other users until the user has rated a sufficient number of items, while a new item cannot be recommended before being rated.
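A minimal sketch of the user-based heuristic (using the toy data of Table 2.1 and cosine similarity as the weighting function; it is not taken from the thesis) predicts a missing rating as a similarity-weighted average of the neighbors' ratings:

    import math

    # User-item matrix of Table 2.1 (missing ratings are simply absent).
    ratings = {
        "Alice": {"The Green Mile": 5, "Slevin": 2},
        "Bob":   {"The Matrix": 3, "Slevin": 5, "The Last Samurai": 1},
        "John":  {"The Green Mile": 4, "The Last Samurai": 5},
    }

    def user_cosine(u, v):
        """Cosine similarity between two users based on their ratings."""
        common = set(ratings[u]) & set(ratings[v])
        if not common:
            return 0.0
        dot = sum(ratings[u][i] * ratings[v][i] for i in common)
        nu = math.sqrt(sum(r * r for r in ratings[u].values()))
        nv = math.sqrt(sum(r * r for r in ratings[v].values()))
        return dot / (nu * nv)

    def predict(user, item):
        """Similarity-weighted average of the neighbors' ratings for the item."""
        neighbors = [(user_cosine(user, v), ratings[v][item])
                     for v in ratings if v != user and item in ratings[v]]
        num = sum(sim * r for sim, r in neighbors)
        den = sum(abs(sim) for sim, _ in neighbors)
        return num / den if den else None

    # Alice might like The Last Samurai because John, who also rated
    # The Green Mile highly, liked it.
    print(predict("Alice", "The Last Samurai"))

The prediction is pulled towards John's high rating because John is Alice's most similar neighbor, which is exactly the intuition behind the example in the text.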

Knowledge-based RS infer and analyze similarities between user requirements and the features of items described in a knowledge base that models users and items according to a particular application domain [33]. Two types of knowledge-based RS worth mentioning are case-based and constraint-based RS. They are similar in terms of the knowledge used, because they both collect user requirements, can restore inconsistent requirements when no solution is found, and can explain the recommendations generated; however, they differ in how recommendations are computed [1]. While content-based and collaborative-filtering RS suit scenarios in which users interact with the system over time, knowledge-based RS do not require an interaction history. Their primary weakness is the high cost of modeling and maintaining the knowledge exploited. Typically, knowledge bases require constant updates due to changes in item features and user requirements.

With the evolution of the Web towards a global space of connected and structured data, a new kind of knowledge-based RS has emerged, known as Linked Data based RS. These systems suggest items by taking into account the knowledge of datasets published under the Linked Data principles. The works belonging to this category are reviewed in Chapter 3.

    Method                 Description
    Weighted               The scores of several recommendation techniques are combined together to produce a single recommendation.
    Switching              The system switches between recommendation techniques depending on the current situation.
    Mixed                  Recommendations from several different RS are presented at the same time.
    Feature combination    Features from different recommendation data sources are combined into a single recommendation algorithm.
    Cascade                One recommender refines the recommendations given by another.
    Feature augmentation   The output of one technique is used as an input feature of another.
    Meta-level             The model learned by one recommender is used as input of another.

Table 2.2 Hybridization methods [34]

Hybrid RS combine one or more of the techniques previously mentioned to improve recommendations. They aim at compensating the weakness of one technique with the strength of another. Burke [34] proposed a classification of possible hybrid techniques, which is summarized in Table 2.2. A widely adopted hybridization is the combination of collaborative-filtering and content-based approaches to mitigate the cold-start and data sparsity problems. However, due to their inner complexity, these systems may have poor performance.

Context-Aware Recommender Systems (CARS) are a particular category of hybrid RS which exploit contextual information to provide more useful recommendations. For example, in a temporal context, vacation recommendations in winter should be very different from those offered in summer, or a restaurant recommendation for a Saturday evening with your friends should be distinct from that suggested for a workday lunch with co-workers [1]. Section 6.2 provides a more detailed description of these systems.

The pros and cons of each technique are summarized in Table 2.3.

    Method                   Advantages                                    Disadvantages
    Content-based            User independence, transparency, new item    Overspecialization, limited content analysis, new user
    Collaborative-filtering  They require only user ratings, scalability  Cold-start, data sparsity, new user, new item
    Knowledge-based          They do not require any interaction history  High cost to model and maintain the knowledge base
    Hybrid                   Mitigate the drawback of one technique       Possibly poor performance
                             with the strength of another

Table 2.3 Comparison of the recommendation techniques described in this section.
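As an example of the weighted hybridization method of Table 2.2, the sketch below (with placeholder component scorers, not an actual system described in this thesis) linearly combines the scores produced by a content-based and a collaborative-filtering component.

    def weighted_hybrid(user, item, scorers, weights):
        """Weighted hybridization: combine the scores of several
        recommendation techniques into a single score."""
        return sum(w * score(user, item) for score, w in zip(scorers, weights))

    # Placeholder component scores in [0, 1]; in practice these would come
    # from a content-based and a collaborative-filtering recommender.
    content_based = lambda u, i: 0.8 if i == "The Shawshank Redemption" else 0.3
    collaborative = lambda u, i: 0.7 if i == "The Last Samurai" else 0.4

    candidates = ["The Shawshank Redemption", "The Last Samurai", "The Matrix"]
    scores = {i: weighted_hybrid("Alice", i,
                                 [content_based, collaborative], [0.6, 0.4])
              for i in candidates}
    print(sorted(scores.items(), key=lambda x: x[1], reverse=True))

Tuning the weights shifts the balance between the two components, which is how this hybridization mitigates the cold-start and data sparsity problems of pure collaborative filtering.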

2.3 The Web of Data

The Web has evolved from an information space for sharing textual documents into a medium for publishing structured data. The Linked Data initiative (http://linkeddata.org) encourages the publication and interlinking of data on the Web, generating the Web of Data, a global dataspace where content is public and interconnected [35]. In this section, we first introduce the needs that the Web of Data aims to address, and then we describe the Linked Data principles and technology stack.

2.3.1 Beyond Data Silos

The Web of Data goes beyond the traditional Web made up of HTML pages which can be read by humans, with hyperlinks manually created. The idea of extending the capabilities of the Web to publish structured data on it has existed since its creation. Tim Berners-Lee [36] highlighted the need of introducing semantics into the Web in order to achieve this idea, which later became known as the Semantic Web (http://www.w3.org/standards/semanticweb/; see also his early talk at http://www.w3.org/Talks/WWW94Tim). While the Semantic Web is the goal, Linked Data provides the means to make it a reality [37]. It refers to a set of best practices for publishing and connecting structured data on the Web to increase the number of data providers and consequently accomplish the goals of the Web of Data. In this way, Linked Data makes it possible to semantically interlink and connect different resources at the data level regardless of their structure, location, etc.

To achieve the Semantic Web, we need to deal with implementing such a Web of Data, i.e. publishing data so that reuse is encouraged and fostering data integration from many different sources. Sharing and reuse are possible only if data is structured. Solutions such as databases or Web APIs are either too domain specific or need ad-hoc consumption techniques. Furthermore, data integration and discovery are possible only when a shared data model is adopted across systems, along with standard data schemas. For instance, consider the "data silo" problem on the Web: sets of APIs expose single datasets, but no connections to external datasets are provided, thus losing an appealing feature of the Web, the hyperlinks between entities. Thus, the birth of Linked Data was a fundamental step towards the Web of Data, since it encourages the generation of a network of interconnected datastores on the Web, going beyond data silos [35]. Figure 2.1 depicts the Web of Data as the Linked Data Cloud diagram; the Web of Data includes, but is not limited to, the datastores represented. The figure shows a number of interconnected datasets which belong to different application domains, such as academic publications, social networks, government, and life sciences. The estimated size of the Web of Data is several billion data statements (known as RDF triples, which are introduced in Section 2.3.2) [35].

Fig. 2.1 The Linked Data Cloud as in 2014 (source: http://lod-cloud.net/). Each circle is a dataset published and interlinked on the Web following the Linked Data principles. The size of the circle represents the size of the dataset, while colors indicate the application domains. Arrows indicate at least 50 links with external datasets.

Fig. 2.2 The Semantic Web stack (image from http://www.w3.org/2007/03/layerCake.png). The continuous line includes the technologies used in the Web of Data, while the dotted line incorporates Linked Data technologies.

2.3.2 Technology Stack and Linked Data Principles

As shown in Figure 2.2, Linked Data relies on a number of technologies which are a subset of those of the Web of Data; the latter is in turn a subset of the Semantic Web. HTTP Uniform Resource Identifiers (URIs, http://www.ietf.org/rfc/rfc2616.txt) generalize Uniform Resource Locators (URLs): the former identify any kind of resource (such as individuals, real-world objects, etc.), while the latter refer to web pages only. The Resource Description Framework (RDF, http://www.w3.org/standards/techs/rdf) [11] structures information as labeled graphs. The graphs are represented by means of triples, which are statements made up of three elements: a subject, a predicate (also called property), and an object. The first and the third elements are nodes in the graph, while the second is an arc directed from the subject to the object. URIs are the identifiers used in RDF. For example, Figure 2.3 shows a simple RDF representation of The Matrix. While at the conceptual level RDF data are graphs, at the machine level there are a number of serializations for these data, each with its own format. The most used are Turtle [38], RDF/XML [39] (XML based), and JSON-LD (JSON for Linked Data, which extends JSON; http://json-ld.org/).

Fig. 2.3 An exemplifying graph representation of The Matrix: the resource http://dbpedia.org/resource/The_Matrix is linked to http://dbpedia.org/resource/The_Wachowskis through the property http://dbpedia.org/property/director, to the literal "The Matrix" (en) through http://xmlns.com/foaf/0.1/name, and to the class http://dbpedia.org/ontology/Film through http://www.w3.org/1999/02/22-rdf-syntax-ns#type.
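The triples of Fig. 2.3 can also be built programmatically; the following sketch assumes the third-party rdflib Python library (not used in the thesis) and serializes the small graph as Turtle.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import FOAF, RDF

    DBR = Namespace("http://dbpedia.org/resource/")
    DBP = Namespace("http://dbpedia.org/property/")
    DBO = Namespace("http://dbpedia.org/ontology/")

    g = Graph()
    movie = DBR["The_Matrix"]

    # Each triple is (subject, predicate, object): two resources linked by a
    # property, or a resource linked to a literal or to a class.
    g.add((movie, DBP["director"], DBR["The_Wachowskis"]))
    g.add((movie, FOAF.name, Literal("The Matrix", lang="en")))
    g.add((movie, RDF.type, DBO["Film"]))

    print(g.serialize(format="turtle"))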

Another important point in interconnecting data is integrating them to foster reuse, so that it is possible to avoid describing the same subject twice; for this reason, data are organized through vocabularies and ontologies. They define the concepts and relationships (also referred to as terms) used to describe and represent an area of concern. They are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms (https://www.w3.org/standards/semanticweb/ontology). There is no clear division between vocabularies and ontologies: usually, ontology indicates a more complex, and possibly quite formal, collection of terms, while in a vocabulary such strict formalism is not necessary. Examples of vocabularies are Dublin Core and Friend Of A Friend (FOAF), which respectively describe digital documents and people or information on the Web. RDFS [40] and OWL (http://www.w3.org/standards/techs/owl) are languages to define Web of Data vocabularies. While RDFS allows specifying classes and properties of resources, together with the domain and range of relations, OWL supports more advanced features and increases the expressiveness.

SPARQL (http://www.w3.org/standards/techs/sparql) allows consuming data: it is the language to query and update RDF data, and it is similar to SQL in its syntax and vocabulary, although it is executed over graphs instead of tables. An example of a query that selects the director of the movie The Matrix is provided in Listing 2.1.

    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>

    SELECT ?director WHERE {
      dbpedia:The_Matrix dbpedia-owl:director ?director .
    }

Listing 2.1 A SPARQL query to retrieve the director of The Matrix.
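A query such as Listing 2.1 can be executed against a public SPARQL endpoint; the sketch below assumes the third-party SPARQLWrapper Python package and the availability of the public DBpedia endpoint (neither is prescribed by the thesis).

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Public DBpedia SPARQL endpoint (availability is not guaranteed).
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX dbpedia: <http://dbpedia.org/resource/>
        PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
        SELECT ?director WHERE {
          dbpedia:The_Matrix dbpedia-owl:director ?director .
        }
    """)
    sparql.setReturnFormat(JSON)

    results = sparql.query().convert()
    for binding in results["results"]["bindings"]:
        # Each binding maps a variable name to a dict with its value and type.
        print(binding["director"]["value"])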

Tim Berners-Lee relied on the aforementioned standards to define the best practices for publishing data on the Web, known as the Linked Data principles (http://www.w3.org/DesignIssues/LinkedData.html). These principles are:

1. use URIs as names for things;

2. use HTTP URIs so that people can look up those names;

3. when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL);

4. include links to other URIs so that they can discover more things.

The first principle states that URIs must be used to name "things": URIs identify real-world objects, abstract concepts, and relationships between objects or resources. The second principle recommends combining the HTTP protocol and URIs to retrieve the desired resource. The third principle proposes the adoption of the RDF data model. The fourth principle highlights the importance of linking resources to others, as done with hyperlinks in HTML pages. Links connecting resources are typed (using RDF), thus unlimited types of relationships may be created. Since RDF links may interconnect resources hosted in different datasets, implementing these principles results in a globally interconnected data space: the Web of Data.
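The second and third principles can be illustrated by dereferencing a URI with HTTP content negotiation; the sketch below assumes the third-party requests library and that the server honors the Accept header (a generic illustration, not part of the thesis).

    import requests

    # Dereference a Linked Data URI, asking for an RDF serialization
    # through HTTP content negotiation (principles 2 and 3).
    uri = "http://dbpedia.org/resource/The_Matrix"
    response = requests.get(uri, headers={"Accept": "text/turtle"},
                            allow_redirects=True, timeout=30)
    response.raise_for_status()

    # The body is RDF; the links it contains to other URIs realize principle 4.
    print(response.headers.get("Content-Type"))
    print(response.text[:500])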

Chapter 3

Linked Data Based Recommender Systems

3.1 Introduction

Nowadays, RS are increasingly common in many application domains, as they use analytic technologies to suggest different items or topics that can be interesting to an end user. However, one of the biggest challenges in these systems is to generate recommendations from the large amount of heterogeneous data that can be extracted from the items. Accordingly, some RS have evolved to exploit the knowledge associated with the relationships between the data of items and data obtained from different existing sources [1]. This evolution has been possible thanks to the rise of structured data published on the Web, such as Linked Data.

This chapter summarizes the state of the art of RS that make use of the structured data published as Linked Data on the Web. We undertook a systematic literature review, which is a form of secondary study that uses a well-defined methodology to identify, analyze and interpret all available evidence related to specific research questions in a way that is unbiased and (to a degree) repeatable [41, 42]. We considered the most relevant problems that RS are intended to solve, the way in which studies addressed these problems using Linked Data, their contributions, application domains, and the evaluation techniques that have been applied to assess their recommendations. Analyzing these aspects, we deduced current limitations and possible directions of future research. Unlike other works reporting the state of the art in RS [16, 25, 43, 44], our systematic literature review is the first to extensively study RS that obtain information from Linked Data in order to generate recommendations. Some approaches were mentioned in the survey of Marie and Gandon [45], but it focused on a different topic.

The remainder of this chapter is structured as follows: Section 3.2 summarizes the methodology and defines objectives and research questions. Section 3.3 outlines the results of the review, organized according to each research question defined in Section 3.2. Section 3.4 discusses the results as well as the limitations of our systematic literature review. Section 3.5 contains the conclusions and future work. We list the selected and excluded papers in Appendix A.

3.2 Research Methodology

This chapter studies the state of the art in Linked Data based RS. It follows the guidelines set out by Kitchenham and Charters [42] for systematic literature reviews in software engineering. These guidelines provide a verifiable method of summarizing existing approaches as well as identifying challenges and future directions in the current research. Figure 3.1 presents the protocol for our systematic literature review; the protocol's purpose is to define the steps to conduct the review.

Fig. 3.1 The process of our systematic literature review.

3.2.1 Research Questions, Search String, and Sources

The goal of our systematic literature review is to understand how the implicit knowledge, stored in Linked Data datasets and represented as concepts and relations between them, can be exploited to make recommendations. Accordingly, we have defined the following research questions:

RQ1 What studies present RS based on Linked Data?

RQ2 What challenges and problems have been faced by researchers in this area?

RQ3 What contributions have already been proposed (e.g. algorithms, frameworks, engines)?

RQ4 How is Linked Data used to provide recommendations?

RQ5 What application domains have been considered?

RQ6 What criteria and techniques are used for evaluation?

RQ7 Which directions are the most promising for future research?

Afterward, a preliminary set of keywords was defined: {Linked Data, Recommender system}. This set was then extended by searching for synonyms, in order to obtain the final set of keywords used to define a search string. The search string is the query used to look for papers in a set of online digital libraries; the one exploited in our case is shown in Listing 3.1. To decide which synonyms we needed to include in the search string, we relied on a set of relevant papers we were familiar with. This set is available in Appendix A. The author and a colleague independently selected a subset of the keywords used in these papers and merged them. We solved disagreements through discussion.

    ("semantic web" OR "linked data" OR "web of data" OR "linked open data")
    AND (recommendation OR "recommender system" OR "recommendation system"
    OR "semantic recommendation" OR "semantic recommender")

Listing 3.1 The search string used to look for papers in digital libraries.

    Source                  URL
    IEEExplore              http://ieeexplore.ieee.org
    SpringerLink            http://link.springer.com
    Scopus                  http://www.scopus.com
    ACM                     http://dl.acm.org
    Science Direct          http://www.sciencedirect.com
    ISI Web of Knowledge    http://apps.webofknowledge.com
    Wiley Online Library    http://onlinelibrary.wiley.com

Table 3.1 The sources selected for our search process.

Furthermore, we selected seven scientific digital libraries that represent primary sources for computer science research publications, as can be seen in Table 3.1. Kitchenham and Brereton [46] recommend relying on IEEE and ACM, to ensure a good coverage of relevant journals and conferences, and on at least two general indexing systems, which were Scopus and ISI Web of Knowledge in our case. We also included Springer and Science Direct because they publish important journals and conferences related to the Semantic Web, Linked Data, and RS. Other sources like DBLP, CiteSeer, and Google Scholar were not considered, as they mainly index data from the primary sources. We excluded these digital libraries because we believe the risk of missing some contribution is low, while the time and effort to search them are high due to the large number of results returned.

3.2.2 Search and Selection

The studies selected for this systematic literature review were identified from the selected sources in March 2014. Table 3.2 lists the inclusion and exclusion criteria defined to determine whether or not a study should be included.

Inclusion Criteria
• Papers presenting RS using Linked Data to provide recommendations.
• Papers addressing exploratory search systems using Linked Data.1 Exploratory search refers to cognitively demanding search tasks such as learning or topic investigation [47]. Exploratory search systems also recommend relevant topics or concepts, although the key difference with respect to RS is that they still require an input query (commonly a set of keywords).
• Papers from conferences and journals.
• Papers published from 2004 to 2014. Linked Data is a relatively new technology, therefore RS approaches exploiting it are also recent.
• Only papers that are written in English.
• Short and workshop papers which fulfill the above criteria: we had no reason to believe that they would fail to provide sufficient levels of detail about their studies.

Exclusion Criteria
• Papers addressing neither RS nor exploratory search systems.
• Papers addressing RS or exploratory search systems that do not exploit Linked Data to produce recommendations.
• Papers addressing similarity measures but not RS. Similarity is a broader topic than RS.
• Papers which use Semantic Web techniques (e.g. rule-based or ontology-based reasoning) and knowledge bases but not Linked Data. Linked Data based RS have specific needs to integrate Linked Data, as explained in Chapter 1.
• Abstracts or slides of presentations, because of the lack of information.
• Grey literature. We do not think that technical reports, unpublished studies, and Ph.D. theses would add much more information with respect to journal and conference papers.

Table 3.2 Inclusion and exclusion criteria.

1 A deep discussion of Linked Data based exploratory search systems is out of the scope of this thesis. The reader may refer to the survey of Marie and Gandon [45].

3.2.3 Quality Assessment, Data Extraction and Synthesis

The goal of quality assessment is to exclude low-quality papers, because their results may be biased. We defined a set of quality criteria, listed in the checklist provided in Table 3.3. Each question is scored with the values 1, 0.5, and 0, representing the answers 'yes', 'partly' and 'no'. The author and another colleague evaluated the selected studies using this checklist. To do this, the total set of selected papers was split into two disjoint subsets and each researcher evaluated the papers of one of these subsets. After this evaluation, cross-checking of the assessment was done on arbitrary studies (about 30% of the selected papers) by a third colleague. Agreement on differences was reached by discussion. Finally, for each study, we computed a quality score as the average of the scores of the individual questions and we decided to exclude papers with a score lower than 0.5.

Q1. Did the study clearly describe the challenges and problems that it is addressing? Score: yes / partly / no (1 / 0.5 / 0)
Q2. Did the study review the related work for the problem? Score: yes / partly / no (1 / 0.5 / 0)
Q3. Did the study discuss related issues and compare with the alternatives? Score: yes / partly / no (1 / 0.5 / 0)
Q4. Did the study recommend further research? Score: yes / partly / no (1 / 0.5 / 0)
Q5. Did the study describe the components or architecture of the proposed recommender system? Score: yes / partly / no (1 / 0.5 / 0)
Q6. Did the study provide empirical results? Score:
- The study provided an implementation of its work with an empirical evaluation and it was used in real applications, e.g. by other services (1)
- The study provided an implementation of its work and an empirical evaluation, but it was not referred to or used in other studies/applications (0.75)
- The study provided an implementation only (0.5)
- The study did not provide any implementation, but it was referred to by other works as a base on which to start (0.25)
- The study did not provide any implementation and was not referred to by other works (0)
Q7. Did the study provide a clear description of the context in which the research was carried out? Score: yes / partly / no (1 / 0.5 / 0)
Q8. Did the study present a clear statement of findings? Score: yes / partly / no (1 / 0.5 / 0)

Table 3.3 Quality assessment checklist.

Data extraction was done in parallel with the quality assessment. We split the set of included studies into two disjoint subsets. The author and another colleague performed the task on a subset each, then a third colleague cross-checked a random sample of 30% of the studies. The data extracted are presented in Table 3.4. The synthesis step is based on the methodology for thematic synthesis described by Cruzes and Dybå [48]. This methodology defines codes as descriptive labels applied to segments of text from each study. We defined an initial set of codes based on the research questions and, subsequently, we performed a second coding with more precise codes, which were closer to the content of the selected papers. The coding was performed by the author and another colleague: each of them addressed a subset of the papers, as for data extraction and quality assessment, since it was done in parallel with them. Then, a third colleague repeated the coding on a random sample of 30% of the papers for cross-checking; afterward, disagreements were resolved by discussion. The codes used are shown in Appendix A.4.
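As an illustration of the quality scoring described above, a minimal sketch (Python, with hypothetical study identifiers and scores rather than the actual data of the review) that averages the per-question scores and applies the 0.5 inclusion threshold could look as follows:

# Illustrative only: average the per-question quality scores (Q1-Q8)
# of each study and keep the studies whose average reaches the threshold.
quality_answers = {
    "study-01": [1, 0.5, 1, 0, 1, 0.75, 0, 1],      # hypothetical scores
    "study-02": [0.5, 0, 0, 0.5, 0, 0.25, 0, 0.5],  # hypothetical scores
}

THRESHOLD = 0.5

def quality_score(answers):
    # Quality score of a study: the average of its per-question scores.
    return sum(answers) / len(answers)

included = {study: quality_score(answers)
            for study, answers in quality_answers.items()
            if quality_score(answers) >= THRESHOLD}
excluded = sorted(study for study in quality_answers if study not in included)
print(included)   # studies kept for the review, with their scores
print(excluded)   # studies that would be dropped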

3.3 Results

This section summarizes the relevant information found in the studies selected in order to answer the proposed research questions. A further discussion and analysis of these results are addressed in Section 3.4.

3.3.1 Included Studies

RQ1 regards the studies that present RS based on Linked Data. We retrieved 69 papers to include in the systematic literature review, corresponding to 52 unique primary studies (a study is a unique research work that can include one or more papers). These studies were published in conferences, workshops and journals between 2004 and 2014. The criteria for deciding the most significant paper for each

study were completeness and publication year. The final set of selected papers and corresponding studies can be found in Appendix A. Concerning the quality assessment, the quality score was higher than 0.5 for all papers, i.e. rather good according to the quality criteria defined in Section 3.2.3. Thus, we did not exclude any paper because of its quality. In fact, the goal of quality assessment is to avoid including low-quality studies, since their results could be biased; it is not a result of the systematic review itself.

Data Field                     Description                                        Research Question
ID                             -                                                  -
Title                          -                                                  -
Authors                        -                                                  -
Year of publication            -                                                  -
Year of conference             -                                                  -
Volume                         -                                                  -
Issue                          -                                                  -
Location                       -                                                  -
Proceeding title               -                                                  -
ISBN                           -                                                  -
Publisher                      -                                                  -
Examiner                       Name of person who performed data extraction      -
Publication source             -                                                  -
Context                        Environment in which study was conducted:         -
                               industry, academic, government
Population                     Study participants: students, academics,          -
                               practitioners, etc.
Aims                           Goals of the study (in our opinion when not       -
                               clearly reported by authors)
Research problem               -                                                  RQ2
Application domain             -                                                  RQ5
Contributions                  -                                                  RQ3
Criteria and techniques for    -                                                  RQ6
evaluation
Findings                       -                                                  -
Limitations                    -                                                  RQ7
Future work                    -                                                  RQ7
Notes                          -                                                  -
Other Information              -                                                  -

Table 3.4 Data extraction form.

Fig. 3.2 Average quality score per question.

As shown in Figure 3.2, Q7 (Did the study provide a clear description of the context in which the research was carried out?) was the exception: we often answered this question negatively and the resulting average score is lower than 0.5. On the contrary, the other questions are acceptable, as their average scores are close to or above 0.6.

3.3.2 Research Problems

In order to address RQ2, we summarize the main problems that the studies considered aim to solve with regard to the production of accurate recommendations. Table 3.5 lists these problems according to the number of studies in which they occurred. The number of studies represents the occurrences of each problem in the studies selected; a problem may be addressed in more than one study. The same applies to the rest of the results reported in this section. In the following we describe each item of Table 3.5:

Lack of semantic information It was the most frequent problem in the studies selected and it concerns the need for exploiting the rich semantics of information about items. Possible causes of this problem are: data about items are unstructured; a categorization of the items is needed; it is necessary to find relationships to link items; social information is lacking; it is necessary to acquire content-descriptive metadata; similarity measures that take into account semantic information are needed.

Problem                                               Number of studies
Lack of semantic information                          13
Complexity of information about items                 12
User dependency                                       8
Cold-start                                            6
Data quality                                          6
Computational complexity                              5
Data sparsity                                         5
Domain dependency or specific and limited domain      4
Other problems                                        2

Table 3.5 Distribution of the studies selected according to the problems that they addressed.

Complexity of information about items It is related to the complexity of information due to noisy metadata about features of items. Other causes for this problem are semantic heterogeneity and distribution of resources. The latter can impact on maintenance of the knowledge bases and can also decrease the accuracy of recommendations.

User dependency In a number of cases RS require users to perform manual operations to acquire information about their profiles and interests. Such operations can be user feedback, ratings, filtering, attaching content-descriptive metadata and semantic annotation of items.

Cold-start It is a well-known problem found mainly in collaborative-filtering RS. Cold-start is a situation in which there are not enough ratings for items to generate recommendations, for instance in the case of a new user.

Data quality This problem occurs when the knowledge base used to acquire infor- mation for providing recommendations is not reliable. Problems affecting data quality can range from poor reliability (e.g. wrong links between concepts, or incorrect representations) to poor quality of recommended items.

Computational complexity It is related to the high computational demand that RS require to produce recommendations due to the large amount of data about items.

Data sparsity This is related to the lack of information about users or items and generates a low density of significant data or connections.

Contribution                               Number of studies
Algorithms                                 27
Similarity measures                        12
Ontologies                                 8
Information aggregation or enrichment      8
Others                                     16

Table 3.6 Distribution of the studies selected according to the contributions provided.

Domain dependency It occurs when recommendations are only useful for items in a specific and limited domain without taking into account data that can be obtained from other related domains.

Other problems They include the need for recommending relevant and yet unknown items and the overspecialization of RS.

3.3.3 Contributions

In order to address RQ3, we classified the contributions provided by each study. Table 3.6 shows the different kinds of contributions and the number of studies in which they occurred (each study possibly reports more than one contribution). The definition or extension of a similarity measure and the definition or extension of an ontology account for 12 and 8 studies, respectively, while new or adapted algorithms are addressed by 27 studies in total. Finally, information aggregation or enrichment and various other contributions account for 8 and 16 studies, respectively. In the following, we describe each item of Table 3.6:

Algorithms Most of the studies selected proposed new algorithms or extensions of algorithms existing in the literature. In particular, four categories emerged: (i) defining a new algorithm, (ii) adapting an algorithm to Linked Data, (iii) combining algorithms to obtain a new hybrid algorithm, and (iv) extending an existing algorithm. The definition of a new algorithm was the most frequent with 15 studies, while the adaptation of an algorithm to Linked Data, the combination of algorithms into a new hybrid algorithm and the extension of an existing algorithm each account for 4 studies. Furthermore, we can group algorithms into two classes:

• Graph-based algorithms, which compute relevance scores for items represented as nodes in a graph. Algorithms in this category include: (i) the weight spreading activation algorithm, which propagates the initial score of a source node through its weighted edges; (ii) algorithms that update the scores of its linked nodes; (iii) algorithms that explore concepts and relations defined in an RDF graph; (iv) topic-based algorithms, which find similar items belonging to the same categories of an initial concept; and (v) path-based algorithms to find semantic paths between documents in the RDF graph. A simplified sketch of weight spreading activation is given at the end of this subsection.

• Algorithms that produce recommendations based on statistical techniques applied to Linked Data, such as Support Vector Machines (SVM), Latent Dirichlet Allocation (LDA), Random Indexing (RI) and scaling methods. SVM analyzes and recognizes patterns in RDF triples; LDA is based on the co-occurrence of terms; RI uses distributional statistics to generate high-dimensional vector spaces; scaling methods take into account the probability that an item could be selected based on its popularity (the number of entities directly connected to the node). In addition, some algorithms define item-user matrices to compute semantic similarity based on path-lengths.

Similarity measures The studies selected applied a variety of similarity measures. These include the pairwise cosine function for vector similarity computation between items, feature-based similarity to evaluate semantic distance on different datasets, rating-based similarity to compute the popularity of items among users, semantic relatedness defined by vocabulary meta-descriptions, content similarity that exploits lexical features, expressivity closeness based on the language constructs adopted, distributional relatedness derived from vocabulary usage, and topic-based similarity that captures the relatedness between items based on the categories they belong to.

Ontologies A number of studies proposed ontologies to assist or improve the recommendation process. New ontologies were proposed to facilitate the integration of datasets from a number of domains in order to make RS more flexible to changes, while a combination of existing ontologies described different types of entities such as users and items. Furthermore, it was found that reusing existing ontologies or vocabularies enables interoperability.

Ontologies are also used to represent semantic distances, their explanations, user preferences and item contents. Ontologies used in the selected studies for these purposes include FOAF (Friend Of A Friend),2 SIOC (Semantically-Interlinked Online Communities),3 the Resource List Ontology,4 and the Bibliographic Ontology.5

2 http://www.foaf-project.org/
3 http://rdfs.org/sioc/spec/
4 http://vocab.org/resourcelist/
5 http://bibliontology.com/

Information aggregation or enrichment This refers to the contributions about the aggregation of data into item collections and the enrichment of existing ontologies or vocabularies. For example, this is useful to obtain descriptive information about items and find entities in datasets in order to infer links between them. One contribution of this type is the aggregation of information from a specific domain when items have to be enriched with knowledge contained only in specialized datasets; another is enriching the databases of RS with shared vocabularies.

Others Other contributions include the integration of other techniques such as opinion aggregators, exploitation of trust in web-based social networks to create predictive RS and the use of social-based algorithms to improve the performance of the RS.
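As an illustration of the graph-based class mentioned above, the following minimal sketch (Python; a simplified, generic formulation over a made-up item graph, not the algorithm of any specific study) propagates the score of a source item through the weighted edges of a graph:

# Simplified weight spreading activation over a small, made-up item graph.
# graph: node -> list of (neighbour, edge weight)
graph = {
    "ItemA": [("ItemB", 0.8), ("ItemC", 0.3)],
    "ItemB": [("ItemD", 0.5)],
    "ItemC": [],
    "ItemD": [],
}

def spread_activation(graph, source, decay=0.7, threshold=0.05):
    # Each node passes a fraction (decay * edge weight) of its own score to its
    # neighbours; propagation stops when the passed score becomes negligible.
    scores = {source: 1.0}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbour, weight in graph.get(node, []):
            passed = scores[node] * decay * weight
            if passed > threshold:
                scores[neighbour] = scores.get(neighbour, 0.0) + passed
                frontier.append(neighbour)
    return scores

print(spread_activation(graph, "ItemA"))
# Items closer (and more strongly linked) to ItemA receive higher scores.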

3.3.4 Use of Linked Data

Another interesting aspect that we studied was the use of Linked Data in RS, as underlined by RQ4. We classified the studies selected according to the way they used Linked Data to produce recommendations and grouped them into:

Linked Data driven RS that rely on the knowledge represented as Linked Data to provide recommendations. For example, RS that calculate a semantic similarity based on diverse relationships that can be found between concepts of Linked Data datasets and are related to features or descriptions of items. Such relationships can be paths, links or shared topics among a set of items (a query sketch retrieving such shared topics is given after this list of categories). This category can also include RS that use other techniques applied on data

obtained from Linked Data datasets, for example, weight spreading activation, Vector Space Model (VSM), SVM, LDA, and RI.

Hybrid RS that exploit Linked Data to perform some operations, which may or may not be used to provide recommendations. This means that hybrid RS include Linked Data driven RS, which use recommendation techniques that rely on Linked Data, and RS that use Linked Data in other operations which can be preliminary to the recommendation process (e.g. aggregating more information from other datasets, describing user profiles, or annotating raw data in order to extract information to be integrated and used to recommend).

Representation only RS in this category exploit the RDF format to represent data and use at least one vocabulary or ontology to express the underlying semantics. However, no information is extracted from other datasets and Linked Data are not used to provide recommendations. An example is an RS that represents the information about the users according to the FOAF vocabulary but does not exploit Linked Data for other operations.

Exploratory search These systems are not RS, but their main duty is to assist users to explore knowledge and suggest topics or concepts relevant to an initial one. Exploratory search systems and RS use Linked Data in a similar way, although the key difference is that exploratory search systems require an explicit input query (commonly a set of keywords). Additionally, users in these systems are not only interested in finding items, but also in learning, discovering and understanding novel knowledge on complex or unknown topics [49].
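To make the Linked Data driven case more tangible, the sketch below (Python with the SPARQLWrapper library; the endpoint, the example resources, and the use of dct:subject are illustrative choices, not taken from any selected study) retrieves the categories shared by two DBpedia resources, one possible "shared topic" signal mentioned above:

# Illustrative only: find the DBpedia categories shared by two resources,
# one possible "shared topic" signal for a Linked Data driven RS.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"   # public endpoint, assumed reachable

def shared_categories(resource_a, resource_b):
    # Return the dct:subject categories common to both resources.
    query = f"""
    SELECT DISTINCT ?category WHERE {{
        <{resource_a}> <http://purl.org/dc/terms/subject> ?category .
        <{resource_b}> <http://purl.org/dc/terms/subject> ?category .
    }}
    """
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["category"]["value"] for b in results["results"]["bindings"]]

print(shared_categories("http://dbpedia.org/resource/The_Beatles",
                        "http://dbpedia.org/resource/The_Rolling_Stones"))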

Each study may be assigned to more than one category, i.e. it can be both Linked Data driven and hybrid, or both exploratory search and Linked Data driven. Studies which are representation only cannot belong to other categories. Table 3.7 shows that most of the studies considered are Linked Data driven, and roughly 60% of them are also hybrid. Only 20% of hybrid studies were hybrid only, while the rest are also Linked Data driven. Moreover, 10 studies are representation only and just 4 exploratory search systems were included in the systematic literature review. All of the exploratory search studies are also Linked Data driven. This finding is consistent with the focus of the systematic literature review, which is on RS using Linked Data. It is worth noting that exploratory search is a broader topic: we only consider the exploratory systems that recommend concepts to users.

Category                                      Number of studies
Linked Data driven                            37
Hybrid                                        29
Hybrid and Linked Data driven                 21
Linked Data driven only                       13
Representation only                           10
Hybrid only                                   6
Exploratory search                            4
Exploratory search and Linked Data driven     4
Exploratory search only                       0

Table 3.7 Distribution of the studies selected according to their use of Linked Data.

Fig. 3.3 Distribution of the Linked Data driven studies according to the recommendation techniques that they exploit. Percentages refer to the total number of these studies.

The two most interesting categories are Linked Data driven and hybrid. Figure 3.3 shows the different techniques used by the studies in the first category to provide recommendations. The majority of them rely on datasets or on a similarity measure (respectively about 43% and 35%), while the remaining 22% adapt natural language processing or content-based techniques or exploit reasoning. Instead, Figure 3.4 illustrates the techniques that hybrid studies use together with Linked Data to provide recommendations. Most of them are natural language processing or collaborative filtering methods (accounting for slightly less than 40% and about 35%, respectively), but also reasoning or social networks are exploited in some cases.

Fig. 3.4 Distribution of the hybrid studies according to the recommendation techniques that they exploit. Percentages refer to the total number of hybrid studies.

Dataset                 General   LD driven   Hybrid   Hybrid and LD driven   LD driven only
DBpedia [12]            31        28          20       16                     12
Freebase6               6         6           5        5                      1
YAGO [50]               4         3           3        2                      1
Wordnet [51]            4         2           3        2                      0
DBLP [52]               3         3           3        3                      0
Dataset independent     3         3           3        3                      0
LinkedMDB [53]          3         3           3        3                      0
Geonames7               2         1           2        1                      0
MusicBrainz [54]        2         1           2        1                      0
mySpace [55]            2         2           2        2                      0
ACM8                    1         1           1        1                      0
IEEE9                   1         1           1        1                      0
Eventseer2RDF [56]      1         1           1        1                      0
LinkedUp [57]           1         1           0        0                      1
mEducator [58]          1         1           0        0                      1
LinkedGeoData [59]      1         0           1        0                      0
LODE [60]               1         1           1        1                      0

Table 3.8 Distribution of the studies according to the Linked Data (LD) datasets that they used.

6 http://freebase.com/
7 http://www.geonames.org/
8 http://acm.rkbexplorer.com/
9 http://ieee.rkbexplorer.com/

In addition, we studied which datasets are used and the outcome is presented in Table 3.8. It shows how many studies use a dataset overall and also by considering the four categories previously defined. It is possible to notice that DBpedia is used much more than the others. In fact, it is the biggest dataset and it is the most curated. Furthermore, it contains information about many different domains. Other commonly used datasets are Freebase, YAGO, and Wordnet, but the latter is used in just half of the cases by Linked Data driven studies. In fact, it is also used with natural language processing techniques. On the contrary, the other datasets are used in most cases by Linked Data driven studies and often by studies which are both Linked Data driven and hybrid.

3.3.5 Application Domains

Figure 3.5 illustrates the application domains considered by the studies selected for the systematic literature review. The great majority of the studies (slightly less than 80%) focused on a single domain, while a minority (about 23%) can be used to recommend different kinds of items. A frequently occurring domain is music, which represents 17%, followed by tourism and movies, accounting for roughly 10% each. Web resources, expert recommendations, and video account for between 5% and 7% each, and a number of other domains are considered by the remaining 10% of the studies.

Fig. 3.5 Distribution of the studies selected according to the application domain.

3.3.6 Evaluation Techniques

RQ6 concerns RS evaluation, so we also dealt with this aspect. It is important to note that we focus on RS evaluation, thus GUI evaluation is not considered, although some of the studies addressed it. RS are commonly evaluated according to their computational complexity and accuracy [61]. The former measures the execution time required to produce recommendations, which depends on the complexity of the algorithms used as well as the runtime of third-party systems needed to produce recommendations. The latter is the capacity of the RS to satisfy the individual user’s need for information and it can be evaluated by means of three techniques: online studies, user studies, and offline studies [22]. In online studies, recommendations are shown to users as they use the real-world system. Users do not rate recommendations, rather the system observes how often users accept a recommendation (e.g. through click-through rate). User studies involve users in order to compare recommendations generated by different algorithms with the users’ judgments or ratings and the algorithm with the highest average rating is judged the best algorithm. Offline studies use pre-compiled datasets from which some information is removed for the evaluation. Subsequently, the algorithms are evaluated with respect to the capability to recommend the removed information. Usually, in any case, recommendations generated by a specific RS are compared with well-known similar approaches.10 The most frequent measures are:

• Precision and recall, which evaluate the accuracy of RS taking into account the number of retrieved items, the number of items that evaluators considered as relevant and the total number of available items. The F-measure is the harmonic mean of precision and recall. Another precision measure is the Mean Average Precision (MAP), which is the average of the average precision of each user.

• User ratings, which are techniques in which lists of results from different RS are presented to users, who rate the lists according to their personal criteria [61].

• Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are metrics to measure the predictive accuracy of an RS in terms of rating prediction. MAE calculates the average absolute deviation between the predicted and the actual values in the data set, while RMSE pays more attention to large errors [62]. Ranking quality, instead, takes into account retrieval correctness by assigning a performance score to the output ranking based on the available reference relevance judgments [63]. Common metrics to measure ranking quality are the Normalized Discounted Cumulated Gain (NDCG), average position and presence.

10 A deep discussion of the methods and measures used to evaluate RS is out of the scope of this thesis. The reader may refer to the surveys of Beel et al. [61] or Shani and Gunawardana [22].

Fig. 3.6 Distribution of the studies selected according to the evaluation criteria which they use. Percentages refer to the total number of studies.
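To make the accuracy measures listed above concrete, the following sketch (plain Python with hypothetical item identifiers; illustrative only, not the evaluation code of any selected study) computes precision, recall, F-measure and a binary-relevance NDCG for a single ranked recommendation list:

import math

# Hypothetical data: a ranked recommendation list and the items judged relevant.
recommended = ["i3", "i7", "i1", "i9", "i4"]
relevant = {"i1", "i3", "i5"}

def precision_recall_f(recommended, relevant):
    # Precision: share of recommended items that are relevant.
    # Recall: share of relevant items that were recommended.
    hits = sum(1 for item in recommended if item in relevant)
    precision = hits / len(recommended)
    recall = hits / len(relevant)
    f_measure = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_measure

def ndcg(recommended, relevant):
    # Binary-relevance NDCG: each hit is discounted by the log of its rank.
    dcg = sum(1.0 / math.log2(rank + 2)          # rank is 0-based
              for rank, item in enumerate(recommended) if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), len(recommended))))
    return dcg / ideal if ideal else 0.0

print(precision_recall_f(recommended, relevant))  # about (0.4, 0.667, 0.5)
print(round(ndcg(recommended, relevant), 3))      # about 0.704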

In addition, in an online study, the capability of explaining the provided recommendations was assessed, while in some user studies relevance and unexpectedness, i.e. the degree of novelty of a recommendation for the user, were also evaluated. Figure 3.6 shows the main evaluation techniques found in the studies selected, as well as their classification and their occurrence in these studies. Studies which provided an evaluation accounted for about 70% of the studies included in the systematic literature review. Among these, roughly 55% only used an accuracy technique, while roughly 2% only evaluated the computational complexity and slightly less than 8% considered both accuracy and computational complexity. Table 3.9 details the techniques used in the studies included by considering the type of evaluation used, the property evaluated and the measure used. The most frequent technique used to evaluate RS is measuring accuracy by means of precision and recall (used by 15 offline studies and 6 user studies). We expected this

result, because these metrics are the ones most commonly deployed in information retrieval approaches. Other widely used metrics are user ratings, applied in 8 studies, and F-measure and MAP, accounting for 5 studies each, together with execution time, which is exploited in 7 studies. We found only one work which relied on an online study, while user studies were used in 16 works in total and offline studies in 28. It is important to note that a study could use more than one type of evaluation (e.g. one user study and one offline study).

Type            Property                   Measure          Number of studies
Online study    Explanation                User ratings     1
User study      Prediction accuracy        Precision        4
                                           Recall           2
                                           RMSE             2
                                           MAE              2
                                           MAP              1
                                           F-measure        1
                                           Others           6
                Relevance                  User ratings     4
                Unexpectedness             User ratings     3
Offline study   Prediction accuracy        Precision        15
                                           Recall           15
                                           F-measure        4
                                           MAP              4
                                           NDCG             3
                                           RMSE             1
                                           Others           6
                Computational complexity   Execution time   7

Table 3.9 Distribution of the studies selected according to the evaluation techniques used.

3.3.7 Future Work

RQ7 is related to directions for future research. To address this, we summarized the future work that the studies selected proposed in order to extend or improve their approaches. Specifically, about 67% of the studies included in the systematic literature review present diverse proposals for future work. Table 3.10 lists the most important ones, indicating for each one the number of studies in which it was mentioned. A deeper analysis of these results and a discussion of possible directions is presented in Section 3.4.

Future work                                                    Number of studies
Personalization of recommendations                             8
Use more datasets                                              8
Create hybrid RS                                               7
Similarity measures                                            4
Find more semantic relationships (item-user and item-item)     3
Other proposals for future work                                3
Consider other domains                                         2

Table 3.10 Distribution of the selected studies according to the future work that they propose.

In the following we provide a brief description of each item reported in Table 3.10:

Personalization of recommendations The idea is to know to what extent personalization can improve recommendations without requiring user profile information or user intervention for manual operations (feedback, filtering, annotation, etc.).

Use more datasets This means increasing the range of data used to annotate or match the items to be recommended. It can also be useful to explore new domains, thanks to the use of other datasets which can belong to diverse domains.

Create hybrid RS This refers to exploring new ways to combine diverse recommendation techniques for creating hybrid approaches and improving the relevance and quality of recommendations.

Similarity measures It is the creation of new similarity measures or the improvement of existing ones.

Find more semantic relationships It is the possibility of finding more semantic relationships between items and between users and items. It is considered by three studies.

Consider other domains Although domain dependency is one of the problems found in various studies, only two studies took into account exploring new application domains for providing recommendations.

Other proposals for future work This group includes applications in real-life contexts, algorithms for the categorization of recommendations, improving the performance of algorithms and the study of disambiguation techniques.

3.3.8 Limitations

The limitations reported in the selected studies are also related to RQ7 as these can help us to uncover the open issues in RS based on Linked Data and their relationships with proposals of future work. They are grouped into four main types: datasets, manual operations, personalization and computational complexity. We detail each of them in the following:

Datasets This type describes limitations of RS due to the datasets used. A number of studies required a copy of the entire dataset on a local server in order to reduce the runtime needed to produce recommendations. This had to be done because public datasets sometimes offer limited results, restricted access and high timeouts. Sometimes data had to be manually curated due to the poor reliability of public datasets. A number of RS are limited to the use of only one dataset. This can restrict the knowledge to which the RS has access, preventing data from diverse sources and domains from being obtained.

Manual Operations This means that RS need the user to perform manual operations in order to produce recommendations. Some RS require the manual selection of relevant concepts according to a specific application domain or interests. This is a difficult and tedious task considering the large amount of data that a typical Linked Data dataset can contain. Other RS did not rank their results, so final users are presented with recommendations in no particular order of priority.

Personalization This limitation concerns the lack of recommendations produced according to the user profile or other personal features.

Computational complexity RS still need to improve their performance, due to the high computational demand required to analyze the large amounts of items and information stored in datasets. Another problem is the poor performance of the public endpoints used to access them.

3.4 Discussion

In the first part of this section we present a discussion of the results considering each research question, while in the second part we mention the limitations of our systematic literature review.

3.4.1 Specific Research Questions

This subsection discusses the research questions addressed in this systematic literature review according to the results reported in Section 3.3.

RQ1 is a general question regarding the studies that describe RS based on Linked Data. To provide an answer, we followed the steps described in the protocol presented in Section 3.2 in order to search and select studies in this area. Firstly, we retrieved a total number of 7873 papers (including duplicates) from scientific digital libraries. After each researcher filtered papers by title and abstract, we discussed disagreements and reached consensus on a final set of 69 papers to include in our study, which correspond to 52 unique studies.

RQ2 deals with research problems in the RS domain that researchers intended to solve by proposing approaches based on Linked Data. We found that the lack of semantic information and its complexity were the most prominent problems in RS. Lack of semantics regards the need for rich semantic information about items. This is the main reason to devise novel strategies to represent items and user profiles using diverse semantic techniques exploiting several knowledge sources from the Linked Data cloud. The complexity and heterogeneity of information and the subsequent cost of maintenance of knowledge bases make Linked Data a suitable solution, which uses publicly available knowledge bases that are continuously growing and maintained by third parties. However, this poses new challenges, for example, the need for mechanisms to assure the reliability of these knowledge bases that are used to describe user profiles and items and to generate recommendations.

Domain dependency is another problem that has also been addressed by using Linked Data, because it offers the possibility to exploit information from different datasets that can be domain-independent or belong to diverse domains. In fact, this is one reason why the most used dataset is DBpedia, as it is the most generic dataset and can be used for cross-domain RS. Nonetheless, some studies still report this problem as future work.

Computational complexity is a question that has not been widely addressed in the studies considered in this systematic literature review and remains an

open issue, because most of the studies have concentrated only on the semantic enrichment of items and the inclusion of Linked Data datasets. Computational complexity needs more attention because, in RS, not only accuracy is important, but also scalability and responsiveness. For example, this problem can be critical in RS for mobile scenarios where users demand fast response times. Other problems such as usability, cold-start, data quality and data sparsity have been addressed by combining various techniques based on natural language processing, reasoning or social network resources with Linked Data, and by creating hybrid RS that exploit both collaborative filtering and content-based approaches.

RQ3 inquires about the contributions proposed in RS based on Linked Data. The analysis showed that the majority of studies focus on providing new algorithms, but also on defining or extending a similarity measure or an ontology. Furthermore, the adaptation, combination or extension of algorithms is quite often addressed together with information aggregation or enrichment. Accordingly, we found that Linked Data can be used in RS for several purposes, such as:

• Defining different similarity functions between items or users by exploiting the large amount of data available in the Linked Data cloud and the vast relationships already established, such as properties or context-based categories. In this way, it is possible to extract semantic information from textual descriptions or other textual properties of the items in order to find semantic similarities based on the information stored in the interlinked vocabularies of Linked Data. This can be useful in RS based on collaborative filtering to improve user-to-user or item-to-item neighborhood formation (a minimal sketch of such a similarity function is given after this list).

• Generating serendipitous recommendations, for example to recommend items that are not part of the user's personal data cloud, i.e. suggest new, possibly unknown items to the user; or to guide users in the process of exploring the search space, giving the possibility of serendipitous discovery of unknown information (for exploratory search systems).

• Offering the explanation of the recommendations given to the users by following the linked-data paths among the recommended items. In this

way, users can understand the relationship between the recommended items and why these items were recommended.

• Domain independence when creating RS, as it is possible to access data from Linked Data datasets from different domains.

• Enrichment of information sources such as databases, repositories, registries, etc. with information obtained from Linked Data datasets, which manage huge amounts of open data. It offers the possibility to enrich graphs representing users and/or items with new properties in order to improve graph-based recommendation algorithms. Additionally, it helps to mitigate the new-user, new-item and sparsity problems.

• Annotating items and users with information from multiple sources, which makes it easier for RS to suggest items from different sources without changing their inner recommendation algorithms. Using such a semantic-based knowledge representation, recommendation algorithms can be designed independently from the domain of discourse.

• Obtaining a hierarchical representation of items thanks to the topic distribution that some Linked Data datasets offer. In this way, RS can base their recommendations on the exploration of items belonging to similar categories.
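As a minimal illustration of the first point in the list above, a similarity function over the Linked Data categories of items (here a simple Jaccard index over made-up category sets; not the measure of any particular study) could support item-to-item neighbourhood formation in a collaborative filtering RS:

# Illustrative only: Jaccard similarity over the Linked Data categories of items,
# usable for item-to-item neighbourhood formation in collaborative filtering.
item_categories = {
    "MovieA": {"dbc:Science_fiction_films", "dbc:1980s_films"},
    "MovieB": {"dbc:Science_fiction_films", "dbc:2010s_films"},
    "MovieC": {"dbc:Romantic_comedy_films"},
}

def jaccard(categories_a, categories_b):
    # Ratio of shared categories to all categories of the two items.
    union = categories_a | categories_b
    return len(categories_a & categories_b) / len(union) if union else 0.0

def neighbours(item, k=2):
    # The k items most similar to `item` according to category overlap.
    others = [(other, jaccard(item_categories[item], categories))
              for other, categories in item_categories.items() if other != item]
    return sorted(others, key=lambda pair: pair[1], reverse=True)[:k]

print(neighbours("MovieA"))
# MovieB shares one category with MovieA, MovieC shares none.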

RQ4 regards the diverse ways in which Linked Data is used to provide recommendations. First of all, we classified the studies according to the way they exploited Linked Data. As reported in Section 3.3, four categories were identified: Linked Data driven RS rely mainly on Linked Data to perform their tasks; hybrid RS use Linked Data but also other techniques; representation only RS do not provide Linked Data based recommendations but use Linked Data to represent data based on RDF; and, finally, exploratory search systems are not RS but may help users to find concepts or topics and have some features similar to RS, especially in the use of Linked Data. Table 3.11 describes each category including the most important studies that adopted these strategies, as well as their advantages and disadvantages. The numbers of the studies correspond to the identifiers indicated in Appendix A. Most of the studies belong to the first category, and many belong to both the first and the second category. These two categories are also the most

Linked Data driven
  Techniques: graph-based, i.e. weight spreading activation (S17), semantic exploration in an RDF graph (S29, S10, S3, S9, S19), and projections (S23); reasoning (S1, S51); statistical, i.e. item-user matrices (S29, S35, S31, S13, S37, S10), scaling methods (S29) and topic discovery (S2).
  Advantages: generating serendipitous recommendations; offering explanations of the recommendations by following the Linked Data paths; creating domain-independent RS; exploiting hierarchical information about items to categorize recommendations.
  Disadvantages: high cost of exploiting semantic features due to the inconsistency of LD datasets; no personalization; no contextual information; high computational complexity; need for manual operations; need for dataset customization to address the computational complexity.

Hybrid
  Techniques: collaborative filtering and Linked Data (S2, S4, S12, S25, S27, S3, S28, S26, S30, S35); information aggregation and Linked Data, i.e. opinions (S16), ratings (S19), and social tags (S32); statistical methods and Linked Data, i.e. Random Indexing (S10), VSM (S47, S31, S35), LDA (S35), implicit feedback (S25), SVM (S13), and structure-based statistical semantics (S37).
  Advantages: overcoming the data sparsity problem; allowing collaborative filtering RS to address the cold-start problem.
  Disadvantages: high computational complexity.

Representation only
  Techniques: item/user information representation using RDF-based ontologies (S36, S38, S20, S40, S14, S15, S42, S46).
  Advantages: improving scalability and reusability of ontologies; easing data integration; enabling complex queries.
  Disadvantages: difficult to reuse the knowledge already available in the Linked Data cloud.

Exploratory search
  Techniques: set nodes and associated lists (S49, S39, S34); spreading activation on typed graphs and graph sampling techniques (S11).
  Advantages: enabling self-explanation of the recommendations.
  Disadvantages: no automation of the recommendation, because exploratory search approaches require frequent interaction with the user.

Table 3.11 Classification of Linked Data based recommendation approaches.

interesting, as they include the RS that better exploit the advantages provided by Linked Data in order to reach the best results. We also studied the techniques used to provide recommendations relying on Linked Data: slightly less than half of the Linked Data driven RS rely on a dataset, almost one third define a similarity measure for Linked Data, while the others adapt natural language processing or content-based methods or use reasoning. With reference to the techniques used together with Linked Data, we found that natural language processing and collaborative filtering are the most used (both account for about one third of hybrid RS), as they intend to provide personalized suggestions of items tailored to the preferences of individual users. Other techniques are less common (less than 15%): reasoning, the use of social network resources and content-based methods. Reasoning has not been widely used, as its quality is still insufficient and its coverage is not broad enough at the level of system components and knowledge elements [64]. Therefore, one solution is to develop RS based on reasoning-oriented natural language processing, enriched with multilingual sources and able to support knowledge sources generated largely by people, such as Linked Data datasets.

As for the datasets used in the studies selected, we found that DBpedia is the most used Linked Data dataset. This may be because DBpedia is a generic dataset and most of the studies are domain independent and need to be evaluated in diverse scenarios. DBpedia is one of the biggest datasets and is frequently updated, as it obtains data from Wikipedia, which continuously grows into one of the central knowledge sources [65]. This makes DBpedia multimodal and suitable for RS that need to be domain independent and for knowledge-based RS where the complexity and cost of maintenance of the knowledge base are high. However, for RS of a single domain, it could be better to use specific datasets, although always implementing a linking interface with generic datasets in order to resolve ambiguities or to exploit unknown semantic relationships.

RQ5 concerns the application domains considered by RS based on Linked Data so far. We identified 12 domains, but we found that some of the RS are domain independent (slightly more than one fifth of the studies). This is because most of the recommendation algorithms proposed can be applied in diverse domains by only changing the dataset or taking only a portion of it in order to obtain the

data to generate the recommendations. However, most of the studies (roughly 80%) targeted a single domain. We also note that items of music, tourism and movies are the most recommended. This may be due to the large amount of data and state-of-the-art datasets available, which allow researchers to compare their results with several works developed in the community. Accordingly, in a number of cases the domain also impacted the datasets, because they require a reduction of information, i.e., only a subset of concepts is considered, which requires offline processing and more effort to maintain the dataset, even if it improves the performance. For example, Passant developed an RS named dbrec [66], which required manually extracting a subset of the DBpedia data related to bands and musical artists.

RQ6 regards the evaluation techniques used to study RS based on Linked Data. Two main properties were considered in the studies included: accuracy and computational complexity. Accuracy evaluates recommendations according to their relevance for final users, while computational complexity measures the execution time required to produce them. With regard to accuracy, our results show that researchers rely more on offline studies than on user or online studies. This result was expected because offline studies require less time and effort. Nevertheless, the usefulness of recommendations may depend on final user preferences more than on comparisons with similar approaches, where evaluation may be biased as researchers must trust the results obtained. Although the results of offline evaluations would not contradict the results of online evaluations, there is also doubt about how reliable user ratings are [67, 68]. Therefore, future evaluation methodologies should be user-centered in order to assure the quality of the results of RS. Additionally, as expected, most of the studies selected were more likely to evaluate their recommendations by applying traditional information retrieval methods such as precision and recall, which are focused on percentages of true positives, false negatives, and false positives.

Interestingly, we found that few works evaluated the computational complexity of RS, which is a critical factor especially for applications that need responses with short timeouts. Therefore, it is still an open issue, considering that accessing Linked Data datasets is in most cases time-consuming and

requires that researchers download dumps of the datasets to access them in local repositories.

RQ7 aimed to uncover the most promising directions for future research on RS based on Linked Data. To address this issue, we have reported not only future work but also the limitations of the studies selected. Section 3.3.7 summarized the future work reported in the studies selected. We found that the most frequently mentioned future works were the personalization of recommendations, the use of more datasets, and the creation of hybrid RS.

The lack of personalization of recommendations is still a common drawback in Linked Data based RS. It concerns the fact that different users obtain the same set of results with the same input parameters. To solve this drawback, some RS need explicit feedback from users in order to differentiate the results based on information about the user's profile (e.g. browsing history, favorite music genre, etc.). However, these approaches force the user to perform extra work, like rating items or building an exhaustive user profile. Consequently, there is a need for non-invasive personalization approaches supported by Linked Data in order to obtain implicit information from the neighborhood relationships user-to-user, item-to-item and user-to-item. These relationships can be inferred from the links between concepts of Linked Data datasets related with properties of items and users.

Using more datasets is needed in order to increase the base of knowledge used to produce recommendations. As presented in Section 3.3.8, there are some limitations of the current Linked Data based RS with regard to the use of Linked Data datasets, such as restricted access, poor reliability, computational complexity, low coverage of languages, domain dependency and the need for installing a local copy of the dataset. For this reason, it is important to investigate new ways to integrate different datasets in order to: (i) extend the knowledge base, allowing the RS to access other datasets in case the main dataset fails or the data are not reliable; (ii) create scalable RS, because they can be adapted to other domains by only accessing the appropriate dataset; and (iii) improve the performance by selecting datasets with better response times.

The creation of hybrid RS is not a new proposal; as could be seen in Section 3.3.4, combining diverse recommendation techniques with Linked

Data-based approaches is a frequent practice in the studies selected. However, we also found that it is still an open issue, because it is necessary to investigate which combinations of techniques are more suitable for an RS applied in diverse contexts. For example, combining Linked Data based RS with social-based RS can be a good choice for applications that require information about the users and their inter-relationships. In this way, RS can access information that is sometimes not available in Linked Data datasets, such as item rating information, user profiles, and other social information.

The inclusion of user profile information (user profiling) is another aspect that is not widely considered in Linked Data recommender systems. The idea behind user profiling is to obtain a meaningful, concept-driven representation of user preferences in order to enable more precise specifications of the user's preferences with less ambiguity. Therefore, this can also be useful to contribute to the personalization of Linked Data based RS.

The automatic selection of the appropriate dataset according to the type of items or the application domain is another challenge that intends to improve the quality of recommendations. This dynamic selection process can help the algorithms to choose the best strategy to find candidate items to be recommended based on the implicit knowledge contained in Linked Data and the relationships with properties of items and users. As a consequence, it is also important to study new similarity measures and techniques able to automatically combine information from different datasets and to deal with the diversity of data in these datasets. Furthermore, it may be possible to create statistical models of user interests to overcome the topical diversity of rated items.

Finally, we found that there is still a need for building testbeds in order to allow for rigorous, transparent, and replicable testing and for studying new techniques (or adaptations of existing ones) for evaluating the accuracy and computational complexity of RS based on Linked Data. This must also consider that Linked Data based RS may access large amounts of information and that links among items can be unknown to the users. Additionally, large-scale RS should also be evaluated in terms of the ability to scale and provide recommendations with data coming from millions of users/items.

3.4.2 Limitations of Our Systematic Literature Review

This section describes the main limitations we faced during our systematic literature review. Firstly, although some of the selected papers were initially included because of their title or abstract, in the end they were excluded because we could not access them from our university. Secondly, we only considered the most relevant paper for each study in order to calculate the frequency of problems, future work, contributions and evaluation techniques. As a consequence, we could be biased, as some papers belonging to the same study may present a problem or contribution not reported in the most relevant paper. Finally, we did not perform a deep validation. Due to time constraints, the majority of studies were read by one researcher, and cross-checking was performed only on about one third of the studies. Nonetheless, for some papers for which assessment was difficult there was discussion between the author and the other colleagues involved in the survey.

3.5 Conclusions

This systematic review has discussed 69 papers reporting 52 primary studies addressing Recommender Systems (RS) that make use of the structured data published as Linked Data. We identified the most relevant problems that these studies aimed to solve and summarized how they used Linked Data to provide recommendations. Although some of our results are already known, we have conducted a systematic study which provides evidence to support those results, limits possible bias, and is to a degree repeatable because it follows a clearly defined protocol. Furthermore, we analyzed the contributions, limitations, application domains, and evaluation techniques of the selected studies to assess the reliability of their results, and the proposed directions for future research.

With regard to the research problems, we found that the most relevant ones were the lack of semantic information and the complexity of information about items. In order to overcome the lack of semantics, RS are enriched with diverse Linked Data datasets that are useful to describe users and items while reducing ambiguity and exploiting the vast amount of links between related concepts stored in these datasets.

The majority of the studies selected have addressed these problems using Linked Data for several purposes, such as (i) finding new relationships or similarities based on links, paths and graphs created on the basis of Linked Data; (ii) generating serendipitous recommendations, i.e. recommending items that are not expected by the users due to the links uncovered once the items are enriched with Linked Data; and (iii) explaining the recommendations, i.e. allowing users to understand the reason for a recommendation by following the paths among items in the Linked Data cloud.

We also provided a classification of the studies selected according to the way they use Linked Data to provide recommendations. In particular, we identified four classes: Linked Data driven RS, which rely on techniques applied on Linked Data datasets, such as categories, paths, and the number of input and output links; hybrid RS, which combine traditional recommendation techniques (e.g. collaborative filtering, content-based, etc.) with Linked Data; representation only RS, which use Linked Data only to represent items or users but not for recommendations; and exploratory search systems, which are not RS but help users to discover content through a guided search and are especially useful for users interested in learning or investigating a topic.

Additionally, we studied the most common datasets that RS use in order to obtain information and we found that more than half of these studies rely on DBpedia. This may be because DBpedia is considered a central hub of the Linked Data cloud, i.e. it is linked to various datasets, which gives the possibility to access diverse data from different application domains. This also makes DBpedia suitable for testing purposes in generic RS.

Concerning the evaluation techniques, the majority of the studies selected focus on accuracy and rely more often on offline studies than on user or online studies. Computational complexity is assessed in few cases; however, we think that it is an important factor to be evaluated, especially for applications needing short response times, such as RS in mobile environments. Additionally, we found there is still a need for building testbeds to allow for testing and studying the results of RS based on Linked Data.

According to our findings, we identified that two recurrent issues in the studies selected are the high computational demand and the domain dependency. Therefore, we believe that further research is still needed to offer non-invasive personalization, exploit more datasets and improve performance. Additionally, future work should focus on providing evaluations of RS considering both accuracy and computational complexity. With regard to application domains, music, movies and tourism items are the most used in RS, and this may be due to the fact that in these domains there are more datasets which help scientists to assess the results of their RS in comparison with similar approaches.

Chapter 4

A Framework for Linked Data based Recommendation Algorithms

4.1 Introduction

Due to the increase of structured data published on the Web through the principles of Linked Data, one is more likely to find resources that describe or represent real life concepts. The information provided by these resources can be used in different domains. In particular, Linked Data may improve RS because they represent multi-domain knowledge, provide standard access to data, and represent semantic relationships among different entities [10]. However, finding and recommending related resources is still an open research area [1]. The problem of finding existing relationships between resources can be addressed by analyzing the categories they belong to, their explicit references to other resources and/or by combining both of these approaches. Currently, there are many works aimed at resolving this problem by focusing on specific application domains and datasets. In this context, this chapter aims to answer the following research questions:

• How can the existing algorithms for recommending resources from the Web of Data be compared to choose the one which best suits the characteristics of a given application domain and a given dataset?

• How can the performance and accuracy of the different existing algorithms be measured to select the one that best suits specific recommendation needs?

To answer these research questions, we propose a framework for deploying and executing Linked Data based recommendation algorithms (implemented following some guidelines), which facilitates conducting studies for their evaluation in different application domains without being bound to a single dataset. Thus, the framework makes it possible to benchmark the algorithms in order to choose the one that best fits the recommendation requirements. Additionally, the framework provides a set of APIs that enable application developers to use it as the main recommendation component in a given architecture. In this way, developers do not need to deal with the execution platform of the algorithms and can focus their efforts either on selecting the existing algorithm that best fits their needs or on writing a customized one. The remainder of this chapter is structured as follows: Section 4.2 introduces an evaluation framework for deploying recommendation algorithms. Section 4.3 details the framework, including the main modules for discovering, ranking, and categorizing resources. Finally, Section 4.4 presents the conclusions and future work.

4.2 The Allied Framework

Allied1 is a framework to deploy and execute resource recommendation algorithms based on Linked Data. Through an implementation of these algorithms, it is possible to test them in different application domains and to analyze their behavior. Accordingly, the framework facilitates the comparison of the results of these algorithms in terms of both performance and relevance. In this way, the framework creates an environment to select, evaluate, and develop algorithms to recommend resources belonging to different contexts and application domains, which can be executed within the same environment and with different configuration parameters. In addition, it enables the creation of innovative applications on top of it. For studying recommendation algorithms, the recommendation process has been divided into four steps (as shown in Figure 4.1):

Resource generation. The first step is intended to generate a set of candidate resources (CR) that maintain semantic relationships with an initial resource (ir).

1 http://natasha.polito.it/AlliedWI

Fig. 4.1 Steps of the recommendation process.

The initial resource may be any resource identified with a URI. The semantic relationships may be seen as direct or indirect links between two resources in a Linked Data dataset.

Results ranking. It sorts the candidate resources generated in the previous step by considering the semantic similarity with the initial resource. In this step, different semantic similarity measures can be used to calculate the semantic similarity between pairs of resources.

Scope-based categorization. The list of ranked candidate resources generated in the previous step may be too general, that is, a recommendation may include resources from unrelated domains of knowledge. For this reason, this optional third step groups these resources already ranked into meaningful clusters that represent common knowledge domains.

Results presentation. Finally, the results of the last step are graphically presented through different facets to allow the end-users to visualize the recommendations.

Based on the recommendation steps mentioned previously, the architectural layers of Allied are: generation, ranking, classification, and presentation (as shown in Figure 4.2). The recommendation process involves one or more layers of the framework. For example, Figure 4.2 shows the generation layer composed of a set of algorithms, {Rec1, Rec2, ..., Recn}, that retrieve resources located at a predefined semantic distance from an initial resource. These algorithms can be integrated with other algorithms of the same layer or of other layers.

Likewise, the algorithms of the ranking layer, {Rank1, Rank2, ..., Rankn}, can be integrated with the generation layer algorithms in order to produce ranked lists based on the semantic relatedness of each tuple (ir, cri), where ir is the initial resource and cri is each one of the candidate resources generated by one of the {Rec1, Rec2, ..., Recn} algorithms.

Fig. 4.2 The conceptual architecture of the Allied framework.

In this way, it is possible to produce recommendations based on semantic relationships (transversal, hierarchical or hybrid) and to study the application of these algorithms in different contexts and domains.
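To make the layered composition more concrete, the following sketch outlines how a generation algorithm and a ranking algorithm could be chained programmatically. It is a minimal illustration in Java; the interface and class names (CandidateGenerator, Ranker, RecommendationPipeline) are hypothetical and not part of the actual Allied API.

```java
import java.util.List;

// Hypothetical interfaces illustrating the generation and ranking layers of Allied.
interface CandidateGenerator {
    // Returns the candidate resources (by URI) related to the initial resource.
    List<String> generate(String initialResourceUri);
}

interface Ranker {
    // Returns a similarity score for the pair (initial resource, candidate resource).
    double similarity(String initialResourceUri, String candidateUri);
}

class RecommendationPipeline {
    private final CandidateGenerator generator;
    private final Ranker ranker;

    RecommendationPipeline(CandidateGenerator generator, Ranker ranker) {
        this.generator = generator;
        this.ranker = ranker;
    }

    // Generates candidates and sorts them by decreasing similarity to the initial resource.
    List<String> recommend(String initialResourceUri) {
        List<String> candidates = generator.generate(initialResourceUri);
        candidates.sort((a, b) -> Double.compare(
                ranker.similarity(initialResourceUri, b),
                ranker.similarity(initialResourceUri, a)));
        return candidates;
    }
}
```

Any generator of the generation layer can thus be paired with any ranker of the ranking layer, which is the property the framework exploits to compare algorithm combinations.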

4.3 Implementation

The algorithms implemented for each layer of Allied are shown in Figure 4.3. The Generation, Ranking, and Classification layers are responsible for the recommendation process, while the Knowledge Base Core and the Presentation layer are in charge of accessing the datasets and presenting the results.

4.3.1 Knowledge Base Core

This module represents the data layer of the Allied framework. It is the main data source containing knowledge about resources and their structural relationships. The current implementation of Allied uses the DBpedia dataset2 as knowledge base, but it can easily be extended to other datasets. DBpedia was selected because it is a general dataset that offers the possibility to evaluate the results in a number of scenarios. DBpedia is one of the biggest datasets, it is frequently updated because its data comes from Wikipedia, and it continuously grows into one of the most interlinked datasets in the Web of Data [69]. Furthermore, a sub-module to retrieve the data from the datasets was developed using the Jena RDF API3 for Java. The knowledge base may be seen as a tuple (R, T, L) composed of resources (R), categories (T), and relationships (L), where:

2 http://dbpedia.org/
3 http://jena.apache.org/tutorials/rdf_api.html

Fig. 4.3 The layered architecture of our implementation of Allied.

Resources are instances of concepts identified by a URI. Concepts are abstractions from real life, such as ideas or notions [70].

Categories denote types, concepts or classes and are the basis of the class hierarchy for the knowledge items. DBpedia provides information about the hierarchical relationships in three different classification schemata: Wikipedia categories,

YAGO categories4 [50], and WordNet synset links5 [51]. In this implementation, the Wikipedia categories (which are represented with concepts of the Simple Knowledge Organization System (SKOS) vocabulary [70]) were chosen to describe categories and their relationships. In effect, they are the most linked in DBpedia, consisting of approximately 80.9 million links for the year 2014.6

Relationships are the links (also known as properties) connecting resources or categories along the whole dataset graph. The knowledge base for the framework contains three types of relationships.

Resource-Resource (R-R) These are the transversal relationships between resources, which are those links between resources that do not refer to hierarchical classifications. Most of the links in DBpedia belong to this type.

Resource-Category (R-T) These are relationships between a resource and a category. They can be represented by using the RDF [11] property rdf:type or the SKOS properties skos:subject (hasCategory) and skos:isSubjectOf (isCategoryOf). However, the two SKOS properties are deprecated [70] and consequently not used in DBpedia. Therefore, DBpedia relates resources to their Wikipedia categories using the dcterms:subject property of the Dublin Core vocabulary7 instead. Accordingly, dcterms:subject is used in Allied for both relationships.

Category-Category (T-T) These are hierarchical relationships between categories within a hyponymy structure (a category tree). They can be represented by using the RDFS [40] property rdfs:subClassOf or the SKOS properties skos:narrower (isSuperCategoryOf) and skos:broader (isSubCategoryOf).

4.3.2 Generation Layer

This layer aims at discovering resources related to a given one through semantic relationships. Given an initial resource (or a set of initial ones), it generates a set of candidate resources located at a predefined distance. For this layer, three generators were implemented based on the semantic relationships found in Linked Data: (i) a transversal generator to exploit direct and indirect relationships between resources (Resource-Resource) while avoiding hierarchical relationships; (ii) a hierarchical generator for indirect relationships between resources through direct relationships between resources and categories (Resource-Category) and between categories (Category-Category); and (iii) a dynamic generator which dynamically combines both types of relationships, giving priority to the existing interlinking between resources. These generators use SPARQL [15] queries to navigate the dataset.

4 http://www.mpi-inf.mpg.de/yago-naga/yago/
5 https://wordnet.princeton.edu
6 http://wiki.dbpedia.org/Datasets#h434-7
7 http://dublincore.org/documents/dcmi-terms/

Transversal Generator

The transversal generator looks for resources that are directly related to a given initial resource and those found through a third resource (indirect relationships). Its implementation is inspired by dbrec [66].

SELECT DISTINCT ?cr WHERE {
  { <ir> ?p ?cr. } UNION { ?cr ?p <ir>. }
  FILTER(isURI(?cr) && ?p != <forbiddenLink1> && ?p != <forbiddenLink2> && ... && ?p != <forbiddenLinkN>).
}

Listing 4.1 The SPARQL query to retrieve resources directly linked to the resource <ir>.

The SPARQL query used to retrieve the resources directly connected with the initial resource is presented in Listing 4.1. In this query, <ir> is the URI of the initial resource, ?p is the link, and ?cr is each one of the candidate resources to be retrieved. A set of forbidden links can be defined to prevent the algorithm from obtaining resources over links pointing to empty nodes (i.e. resources without a URI), literals that are used to identify values such as numbers and dates, or other nodes that are not desired for the recommendation. In other words, it is a way to limit the results of the algorithm. For example, the resource dbr:Turin contains the link dbpprop:populationTotal that points to the integer value 911823. Optionally, a set of allowed links may be added to restrict the set of retrieved resources to those linked with only a set of specific links. In the query of Listing 4.1, the forbidden links are excluded by adding the expression && ?p != <forbiddenLink> for each link.

SELECT DISTINCT ?cr WHERE {
  { <ir> ?p ?o. ?o ?p ?cr. } UNION
  { <ir> ?p ?o. ?cr ?p ?o. } UNION
  { ?o ?p <ir>. ?o ?p ?cr. } UNION
  { ?o ?p <ir>. ?cr ?p ?o. }
  FILTER(isURI(?cr) && isURI(?o) && ?p != <forbiddenLink1> && ?p != <forbiddenLink2> && ... && ?p != <forbiddenLinkN>).
}

Listing 4.2 The query to retrieve resources indirectly linked to the resource <ir>.

The SPARQL query to retrieve resources indirectly connected to the resource <ir> through a third resource (?o) is shown in Listing 4.2.
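As an illustration of how such queries can be executed against DBpedia with the Jena API mentioned in Section 4.3.1, the following sketch builds and runs the direct-link query of Listing 4.1 for a given initial resource and a set of forbidden links. The method name and the use of the public DBpedia SPARQL endpoint are assumptions for the example, not the actual Allied code.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

public class TransversalGeneratorSketch {

    // Builds and runs the query of Listing 4.1: resources directly linked to the initial resource.
    public static List<String> directCandidates(String initialResource, List<String> forbiddenLinks) {
        StringBuilder filter = new StringBuilder("isURI(?cr)");
        for (String link : forbiddenLinks) {
            filter.append(" && ?p != <").append(link).append(">");
        }
        String query = "SELECT DISTINCT ?cr WHERE { "
                + "{ <" + initialResource + "> ?p ?cr. } UNION { ?cr ?p <" + initialResource + ">. } "
                + "FILTER(" + filter + ") }";

        List<String> candidates = new ArrayList<>();
        try (QueryExecution exec =
                QueryExecutionFactory.sparqlService("https://dbpedia.org/sparql", query)) {
            ResultSet results = exec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                candidates.add(row.getResource("cr").getURI());
            }
        }
        return candidates;
    }
}
```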

Hierarchical Generator

The hierarchical generator generates a set of candidate resources located at a specified distance in a hierarchy of categories taken from a category tree described in a dataset. The implementation of this module is inspired by the work of Damljanovic et al. [71], which obtains candidate resources by navigating a category tree of the Wikipedia categories. The hierarchical generator firstly extracts the base categories of an initial resource (<ir>) and then looks for broader categories until a maximum distance (which may be user-defined) is reached. This maximum distance is the hierarchical distance of a broader category from the base categories. It is inversely proportional to the level of specificity of a category (i.e. a higher distance means that a category has a lower level of specificity). Listing 4.3 presents the SPARQL query used by the hierarchical generator to obtain the base categories of an initial resource (<ir>). As said before, dcterms:subject is used in Allied because skos:isSubjectOf and skos:subject are deprecated and not employed in DBpedia.

PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?cat WHERE {
  <ir> dcterms:subject ?cat.
}

Listing 4.3 The SPARQL query to retrieve base categories of the resource <ir>.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?broaderCat WHERE {
  { <cat> skos:broader ?broaderCat. } UNION { ?broaderCat skos:narrower <cat>. }
  ?broaderCat rdfs:label ?categoryName.
  FILTER(lang(?categoryName)="en").
}

Listing 4.4 The SPARQL query to retrieve broader categories of the category <cat>.

Listing 4.4 shows the SPARQL query used to recursively extract broader categories for each base category, starting from a distance equal to 1 until a maximum distance is reached. In this query, <cat> is the URI of the subcategory and the FILTER limits the search to categories labeled in English. After extracting categories, this module extracts subcategories for all the broader categories at the maximum distance (i.e. it descends one level into the category tree) to increase the possibility of finding more candidate resources. Finally, the algorithm obtains candidate resources for each category (including subcategories). Listing 4.5 presents the SPARQL query that extracts subcategories of each broader category obtained by recursive application of the query shown in Listing 4.4. Listing 4.6 obtains candidate resources for each category. In this SPARQL query, <cat> denotes the URI of one of the categories retrieved in the previous steps. As a result, the module creates a “category graph”, including the initial resource, its category tree, and the candidate resources retrieved for each category. For example, Figure 4.4 shows the category graph for the resource Mole Antonelliana.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?subCat WHERE {
  { <cat> skos:narrower ?subCat. } UNION { ?subCat skos:broader <cat>. }
  ?subCat rdfs:label ?categoryName.
  FILTER(lang(?categoryName)="en").
}

Listing 4.5 The SPARQL query to retrieve subcategories of the category <cat>.

PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?cr WHERE {
  ?cr dcterms:subject <cat>.
}

Listing 4.6 The SPARQL query to obtain candidate resources of the category <cat>.
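The level-by-level expansion performed by the hierarchical generator can be summarized as in the following simplified Java sketch. The helper methods baseCategories, broaderCategories and resourcesInCategory stand for the queries of Listings 4.3, 4.4 and 4.6 respectively; the final descent into subcategories (Listing 4.5) is omitted for brevity. This is an illustration of the idea, not the actual Allied implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HierarchicalGeneratorSketch {

    // Expands the category tree level by level up to maxDistance and collects candidate resources.
    public Set<String> generate(String initialResource, int maxDistance) {
        // Listing 4.3: base categories of the initial resource.
        Set<String> frontier = new HashSet<>(baseCategories(initialResource));
        Set<String> allCategories = new HashSet<>(frontier);

        for (int distance = 1; distance <= maxDistance; distance++) {
            Set<String> next = new HashSet<>();
            for (String category : frontier) {
                // Listing 4.4: broader categories of each category in the frontier.
                next.addAll(broaderCategories(category));
            }
            allCategories.addAll(next);
            frontier = next;
        }

        // Listing 4.6: candidate resources belonging to each collected category.
        Set<String> candidates = new HashSet<>();
        for (String category : allCategories) {
            candidates.addAll(resourcesInCategory(category));
        }
        return candidates;
    }

    // The three helpers below stand for the SPARQL queries of Listings 4.3, 4.4 and 4.6.
    List<String> baseCategories(String resourceUri) { return new ArrayList<>(); }
    List<String> broaderCategories(String categoryUri) { return new ArrayList<>(); }
    List<String> resourcesInCategory(String categoryUri) { return new ArrayList<>(); }
}
```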

Dynamic Generator

The dynamic generator is a “hybrid” generator, which takes advantage of both the transversal and the hierarchical approaches, giving priority to the existing interlinking between resources, that is, one of the four principles of Linked Data [37]. The innovative algorithm of this generator is explained in Section 5.3.

4.3.3 Ranking Layer

This layer ranks the candidate resources obtained in the previous layer based on semantic similarity functions. These candidate resources are sorted according to the values of a semantic similarity function, which measures the similarity between the initial resource and each one of these candidate resources. The framework in its current implementation includes (but is not limited to) three ranking algorithms.

Fig. 4.4 Example of a category graph for the resource Mole Antonelliana (candidate resources are not included for space reasons).

Similarly to the algorithms of the generation layer, the ranking algorithms are also based on the semantic relationships in Linked Data.

Transversal LDSD Ranking

The transversal LDSD ranking algorithm calculates the Linked Data Semantic Distance (LDSD) between an initial resource and each one of the candidate resources obtained in the generation layer. The LDSD distance, initially proposed by Passant [66], is based on the number of indirect and direct links between two resources.

The similarity of two resources (r1,r2) is measured by combining four properties: the direct input links, the direct output links, the indirect input links, and indirect output links. Equation 4.1 illustrates the basic form of the LDSD distance.

LDSD(r_1, r_2) = \frac{1}{1 + C_{d\,out} + C_{d\,in} + C_{i\,out} + C_{i\,in}} \qquad (4.1)

Cdout is the number of direct output links (from r1 to r2), Cdin is the number of direct input links (from r2 to r1), Ciout is the number of indirect output links, and Ciin is the number of indirect input links.

Unlike the implementation developed by Passant, which is limited to links from a specific domain, the LDSD function implemented in Allied takes into account all resources of the dataset. However, it can be customized to consider only certain types of links, belonging or not to a particular domain, by adding a set of forbidden links. The SPARQL query that counts direct input and output links between an initial resource and a resource of the set of candidate resources is presented in Listing 4.7. The SPARQL query that counts input and output indirect links between an initial resource (<ir>) and a resource (<cr>) from the set of candidate resources is presented in Listing 4.8.

SELECT DISTINCT count(?p) WHERE {
  # output links
  { <ir> ?p <cr>. }
  # input links
  UNION { <cr> ?p <ir>. }
}

Listing 4.7 The SPARQL query to count input and output direct links.

SELECT DISTINCT count(?p) WHERE {
  # input links
  { ?o ?p <ir>. ?o ?p <cr>. }
  UNION { ?o ?p <ir>. <cr> ?p ?o. }
  # output links
  UNION { <ir> ?p ?o. ?o ?p <cr>. }
  UNION { <ir> ?p ?o. <cr> ?p ?o. }
}

Listing 4.8 The SPARQL query to count input and output indirect links.

Using these SPARQL queries, the transversal ranking algorithm calculates the LDSD for each pair of resources composed of an initial resource and each of the resources obtained from the generation layer.
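Once the four counts have been obtained with the queries of Listings 4.7 and 4.8, the distance of Equation 4.1 reduces to a one-line computation. The sketch below is illustrative only; the counts are assumed to come from those queries.

```java
public class LdsdRankerSketch {

    // Equation 4.1: the basic form of the Linked Data Semantic Distance.
    // cdOut/cdIn are the direct link counts, ciOut/ciIn the indirect link counts
    // between the initial resource and a candidate resource (Listings 4.7 and 4.8).
    public static double ldsd(int cdOut, int cdIn, int ciOut, int ciIn) {
        return 1.0 / (1 + cdOut + cdIn + ciOut + ciIn);
    }

    public static void main(String[] args) {
        // A resource pair with several links is closer (smaller distance)
        // than a pair with no links at all.
        System.out.println(ldsd(2, 1, 3, 2)); // 1 / 9 ≈ 0.111
        System.out.println(ldsd(0, 0, 0, 0)); // 1.0 (maximum distance value)
    }
}
```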

HyProximity Ranking

The HyProximity ranking algorithm is based on the similarity measure defined by Stankovic et al. [72]. This measure can be used to calculate both transversal and hierarchical similarities. The HyProximity in its general form is shown in Equation 4.2 as the inverted distance between two resources, balanced with a weighting function.

In this equation d(r1,r2) is the distance function between the resources r1 and r2, while p(r1,r2) is the weighting function, which is used to weight different distances.

hyP(r_1, r_2) = \frac{p(r_1, r_2)}{d(r_1, r_2)} \qquad (4.2)

Based on the structural relationships (hierarchical and transversal), different distance and weighting functions may be used to calculate the HyProximity similarity:

Hierarchical HyProximity The definition of this similarity function relies on the work of Stankovic et al. [72]. It depends on the maximum distance of categories from the initial resource, as in the hierarchical generator algorithm (Section 4.3.2). In particular, d(ir, ri) = maxDistance, where ir is the initial resource, ri is each one of the candidate resources generated by the hierarchical algorithm, and maxDistance is the maximum distance of the broader categories from the base ones. The weighting function is defined in Equation 4.3, which is an adaptation of the informational content function defined by Seco et al. [73]. In this equation, hypo(C) is the number of descendants of a category C and |C| is the total number of categories in the category graph of C. This function was chosen because it minimizes the computational complexity of the informational content with respect to other functions that use external corpora [74].

p(C) = 1 - \frac{\log(hypo(C) + 1)}{\log(|C|)} \qquad (4.3)

Transversal Hyproximity In this similarity function d(ir,ri) = maxDistance if the generator of resources is hierarchical, otherwise d(ir,ri) = 1 for resources connected to the initial resource through direct transversal links or d(ir,ri) = 2 for indirect transversal links. The weighting function is defined in Equation 4.4:

p_transv(r1, r2) depends on the number of resources connected over a specific property (n) and the total number of resources in the dataset (M). Nonetheless, in Allied, this algorithm is not limited to a specific property, and it can optionally be configured to support a set of forbidden links or allowed links in a similar way as shown in Section 4.3.2 for the generation layer. The number of direct and indirect links was calculated with SPARQL queries. The value of M was fixed to the number of resources contained in DBpedia.8

p_{transv}(r_1, r_2) = -\log\frac{n}{M} \qquad (4.4)

8 http://wiki.dbpedia.org/Datasets#h434-7
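The two HyProximity variants differ only in the distance and weighting functions plugged into Equation 4.2. A compact sketch of both follows, assuming the counts (hypo(C), |C|, n, M) are computed elsewhere; it is illustrative rather than the Allied implementation.

```java
public class HyProximitySketch {

    // Equation 4.3: hierarchical weight of a category C,
    // where hypoC is the number of descendants of C and totalCategories is |C|.
    public static double hierarchicalWeight(int hypoC, int totalCategories) {
        return 1.0 - Math.log(hypoC + 1) / Math.log(totalCategories);
    }

    // Equation 4.4: transversal weight of a property,
    // where n is the number of resources connected over the property and m the dataset size.
    public static double transversalWeight(long n, long m) {
        return -Math.log((double) n / m);
    }

    // Equation 4.2: HyProximity as weight divided by distance.
    public static double hyProximity(double weight, double distance) {
        return weight / distance;
    }

    public static void main(String[] args) {
        // Hierarchical example: a category with few descendants in a large category graph
        // yields a high weight, divided by the maximum hierarchical distance (here 2).
        System.out.println(hyProximity(hierarchicalWeight(10, 100_000), 2));
        // Transversal example: a rare property (n small with respect to M) is weighted more,
        // divided by distance 1 for a direct link.
        System.out.println(hyProximity(transversalWeight(1_000, 4_000_000), 1));
    }
}
```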

4.3.4 Classification Layer

Since this implementation of the framework is based on DBpedia, which is a general-purpose dataset, the results obtained may contain an inherent ambiguity due to the generality of the data used to produce recommendations. Moreover, a single ranked list of recommendations may not always be a good way to show this kind of general results, because users may require results arranged according to their personal needs or knowledge domain. For this reason, the classification layer provides mechanisms to group the results obtained from the ranking layer into meaningful clusters that represent domains of knowledge. Currently, the classification layer relies on Algorithm 1, which is the only algorithm implemented in Allied for categorizing resources based on the hierarchical relationships that exist on the Web of Data. As a result, when an application requires classifying resources according to a knowledge domain, the classification algorithm provides a mechanism to easily access the recommended items organized by clusters. Although in the current implementation of Allied the resulting clusters correspond to Wikipedia categories, the approach could be extended to allow the definition of custom clusters by aggregating a number of categories, or to rely on other category schemes from the Web of Data, such as YAGO classes. Algorithm 1 receives as input a set of ranked candidate resources (CR), an initial resource inURI, a maximum distance (maxDistance), and optionally an initial category graph Gcin (in case a hierarchical structure is already available). If Gcin is not given, then the algorithm creates a new category graph Gc containing categories for the initial resource and the set of candidate resources up to a maximum distance (maxDistance). Otherwise the algorithm creates a copy of Gcin (Lines 1-5).

Algorithm 1 The hierarchical classification algorithm
Require: CR, inURI, maxDistance, optionally Gcin
Ensure: A category graph Gc
 1: if Gcin = null then
 2:   Gc = createCategoryGraph(CR, maxDistance)
 3: else
 4:   Gc = Gcin
 5: end if
 6: CmaxDistance = getMaxDistanceCategories(Gc)
 7: for each pair of categories (ci, cj) ∈ CmaxDistance do
 8:   clcb = getLowestCommonBroaderCategory(ci, cj)
 9:   Add clcb to Gc
10:   Add edge(ci, clcb) and edge(cj, clcb) to Gc
11: end for
12: intersectCategories(Gc)
13: deleteEmptyCategories(Gc)
14: return Gc

In this implementation, maxDistance is set to 2 because some experiments showed that it is a reasonable trade-off between the number of categories and the time consumed. Afterwards, the algorithm extracts categories at the highest distance

(CmaxDistance) and creates pairs of categories combining the elements of CmaxDistance (Lines 6-7). Next, the function getLowestCommonBroaderCategory is executed to find a set of broader categories subsuming the categories of the set CmaxDistance. These new broader categories are then added to Gc together with their edges ((ci, clcb) and (cj, clcb)) (Lines 8-11). This function retrieves the shared ancestor that is most specific, i.e. closest to the two initial nodes or, equivalently, farthest from the root of the category tree. This is an instance of the lowest common ancestor problem in a tree or directed acyclic graph, also known as the least common subsumer in ontologies, which here applies to the category tree. Finally, the updated set of categories of Gc is intersected and the function deleteEmptyCategories is executed to remove from the graph those categories subsuming fewer than three subcategories (i.e. only the categories ci and cj). In this way a classification at a higher distance for the candidate resources is created (Lines 12-14).
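The function getLowestCommonBroaderCategory can be implemented as an alternating breadth-first search upwards from both categories until the two frontiers intersect, which finds a shared ancestor closest to the two initial nodes. The sketch below assumes a helper broaderCategories returning the broader categories of a category (for example via Listing 4.4); it illustrates the idea and is not the Allied code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LowestCommonBroaderCategorySketch {

    // Alternating breadth-first search upwards from both categories;
    // the first category reached from both sides is a lowest common broader category.
    public String getLowestCommonBroaderCategory(String c1, String c2) {
        Set<String> seenFrom1 = new HashSet<>();
        Set<String> seenFrom2 = new HashSet<>();
        Deque<String> queue1 = new ArrayDeque<>(List.of(c1));
        Deque<String> queue2 = new ArrayDeque<>(List.of(c2));
        seenFrom1.add(c1);
        seenFrom2.add(c2);

        while (!queue1.isEmpty() || !queue2.isEmpty()) {
            String hit = expand(queue1, seenFrom1, seenFrom2);
            if (hit != null) return hit;
            hit = expand(queue2, seenFrom2, seenFrom1);
            if (hit != null) return hit;
        }
        return null; // no common broader category within the explored graph
    }

    // Expands one level of the search and reports a category already seen from the other side.
    private String expand(Deque<String> queue, Set<String> seen, Set<String> other) {
        int levelSize = queue.size();
        for (int i = 0; i < levelSize; i++) {
            String current = queue.poll();
            for (String broader : broaderCategories(current)) {
                if (other.contains(broader)) return broader;
                if (seen.add(broader)) queue.add(broader);
            }
        }
        return null;
    }

    // Stands for the query of Listing 4.4 (skos:broader / skos:narrower).
    List<String> broaderCategories(String categoryUri) { return new ArrayList<>(); }
}
```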

4.3.5 Presentation Layer

Allied can easily be integrated into any application that requires recommendations based on Linked Data. The current implementation includes three main interfaces that provide mechanisms to present results to the final user: a web interface, a standalone interface, and a RESTful interface. Figure 4.5 shows the main view of the Allied web interface, which allows the user to choose a recommendation algorithm. Note that this version is limited to predefined configurations of generation and ranking algorithms, representing state-of-the-art approaches, although more combinations are possible in Allied. This was done because we aimed to provide a prototype which does not require a complex set-up.

Fig. 4.5 The home view of the Allied web interface.

Figure 4.6 depicts an example of some results presented in the Allied web interface for the initial resource The Dark Knight, which is a movie. For space reasons, only the first five results are shown in the example. Figure 4.7 illustrates the set-up view for the desktop version of the framework. This view allows the user to choose the implemented algorithms for generation and ranking, and to select configuration parameters for the execution, e.g. the hierarchical distance, which is the maximum distance for the hierarchical generator and ranking. Additionally, this view allows the user to choose how he/she wants to see the results: as a graph or a tree. Figure 4.8 shows an example of a tree of results for the resource Mole Antonelliana: in this case, candidate resources are arranged in folders that represent clusters. An example of a graph of results is depicted in Figure 4.4.

Fig. 4.6 An example of results shown in the Allied web interface.

Fig. 4.7 The set-up view for the desktop version of the Allied framework.

Fig. 4.8 An example of results as tree view for the desktop version of the Allied framework.

The RESTful interface is publicly available on the Web. It is based on two parameters: uri and scope. The former indicates the URI of the initial resource, while the latter is optional and makes it possible to limit the recommendations to resources belonging to the specified cluster. If it is missing, a list of clusters of the initial resource is returned by the service. An example of an HTTP request to retrieve the clusters of Mole Antonelliana is presented in Listing 4.9, while the request to retrieve the resources related to Mole Antonelliana which are located in Turin is shown in Listing 4.10.

GET /recommendations?uri=http://dbpedia.org/resource/Mole_Antonelliana

Listing 4.9 An HTTP request to retrieve the clusters of Mole Antonelliana.

GET /recommendations?uri=http://dbpedia.org/resource/Mole_Antonelliana&scope=http://dbpedia.org/resource/Category:Turin

Listing 4.10 An HTTP request to retrieve the resources related to Mole Antonelliana which are located in Turin.
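As an illustration, the RESTful interface can be invoked from any HTTP client. The following Java snippet issues the two requests of Listings 4.9 and 4.10; the base URL is an assumption for the example, and the response is returned as a raw string since the exact response schema is not detailed here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class AlliedRestClientSketch {

    private static final String BASE = "http://natasha.polito.it"; // assumed host of the service

    static String recommendations(String resourceUri, String scopeUri)
            throws IOException, InterruptedException {
        String url = BASE + "/recommendations?uri="
                + URLEncoder.encode(resourceUri, StandardCharsets.UTF_8);
        if (scopeUri != null) {
            url += "&scope=" + URLEncoder.encode(scopeUri, StandardCharsets.UTF_8);
        }
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Listing 4.9: clusters of Mole Antonelliana (no scope given).
        System.out.println(recommendations("http://dbpedia.org/resource/Mole_Antonelliana", null));
        // Listing 4.10: resources related to Mole Antonelliana located in Turin.
        System.out.println(recommendations("http://dbpedia.org/resource/Mole_Antonelliana",
                "http://dbpedia.org/resource/Category:Turin"));
    }
}
```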

4.4 Conclusions

This chapter presented Allied, a framework for deploying and executing recommendation algorithms that use Linked Data as their knowledge base. The current version of Allied implements a set of three state-of-the-art transversal and hierarchical algorithms and ReDyAl, a hybrid algorithm that dynamically integrates both the transversal and hierarchical approaches for discovering resources, which is presented in Chapter 5. The framework makes it possible to choose the best implemented algorithm for recommending resources from the Web of Data when focusing on a specific application. It also makes it possible to select the algorithm which best suits a particular domain, because the framework provides cross-domain recommendations, since it relies on DBpedia. In addition, since the approach is general, it is possible to adapt Allied to other datasets and to select the algorithm which best fits the characteristics of the dataset. The algorithms currently implemented in Allied were evaluated and compared by conducting a user study, presented in Section 5.4, which relied on Allied. The framework facilitated the study, since the algorithms were deployed in the same environment and the generated recommendations were aligned. Thus, it made it possible to measure and compare the accuracy of the algorithms. For this reason, conducting studies through our framework may also increase their reproducibility. At the moment it is possible to test the framework through its web interface.9

9 http://natasha.polito.it/AlliedWI

Chapter 5

A Dynamic Recommendation Algorithm Based on Linked Data

5.1 Introduction

The work presented in this chapter builds on the results obtained from the study presented in Chapter 3 and is its continuation. The study stated that the problem of finding existing relationships between resources could be addressed by analyzing the categories they belong to, their explicit references to other resources, and/or by combining both these approaches. The study also showed that many works aimed at resolving this problem by focusing on a specific application domain and dataset. In this chapter, we address this issue, and we focus on the following research questions:

• How can we design a recommendation algorithm that exploits existing relationships between resources in Linked Data, is independent of the application domain, and may be used on different datasets on the Web of Data?

• How can we design a recommendation algorithm that provides novel recommendations, i.e., recommendations of resources not previously known to the user, without affecting prediction accuracy?

We propose a new algorithm based on Linked Data which exploits existing relationships between resources to recommend related resources. It dynamically analyzes the categories they belong to and their explicit references to other resources, then combines the results. The algorithm has been applied to DBpedia,1 but it could as well be applied to other datasets on the Web of Data, and it is not bound to any particular application domain. We conducted a user study to comparatively evaluate its accuracy and novelty against three state-of-the-art algorithms, which showed that our algorithm provides a higher number of novel recommendations while keeping a satisfying prediction accuracy. An implementation of our recommendation algorithm has been integrated into two mobile applications, which were developed in collaboration with Telecom Italia, the major network operator in Italy. The first suggests movies, while the second assists tourists. Both are based on DBpedia. The chapter is organized as follows: Section 5.2 reviews related works; Section 5.3 presents our algorithm; Section 5.4 describes the evaluation method and provides the results; Section 5.5 shows the application of our algorithm for recommending movies and tourist attractions; Section 5.6 provides the conclusions.

5.2 Related Work

The different approaches to exploit Linked Data to recommend resources are described in Chapter 3. Some studies infer relationships between resources by taking into account the existing links between them in a dataset and use these relationships to measure the semantic similarity of the resources. Such relationships can be direct links, paths, or shared topics between sets of items. The most important related works are summarized in the following. Damljanovic et al. [71] recommended experts in an open innovation scenario. Their approach, named HyProximity, takes as input a description of a problem in natural language and extracts a set of relevant words that are linked with resources of DBpedia. Then it generates recommendations by combining two techniques. The first one discovers resources related through hierarchical relationships, while the second one is based on transversal relationships, which connect resources without establishing a classification or hierarchy. By exploiting these two kinds of relationships, the approach identifies a set of direct or indirect topics related with potential experts to solve an innovation problem.

1 http://dbpedia.org

Passant [66] described dbrec, a recommender system targeted at the music domain, which mainly relies on a distance measure named Linked Data Semantic Distance (LDSD). It takes into account the number of direct or indirect links between resources (related to the music domain) represented in DBpedia. Unlike HyProximity, it does not distinguish between transversal and hierarchical links. Both Damljanovic et al. and Passant had to reduce the set of resources and links of the dataset to those belonging to a single domain (innovation problems and music respectively), which involves a massive effort to define which resources or links should be considered. Other works combine Linked Data based algorithms with other recommendation techniques to improve the results. These techniques include collaborative filtering [75–78], information aggregation [79–81] and statistical methods like Vector Space Model (VSM) [62, 77], Random Indexing (RI) [72], implicit feedback [77], Latent Dirichlet Allocation (LDA) [82], and structure-based statistical semantics [83]. De Graaff et al. [84] proposed a knowledge-based recommender system that derives the user interests from the user’s social media profile, which is enriched with information from DBpedia. Musto et al. [85] compared several techniques to automatically feed a graph-based recommender system with features extracted from Linked Data. However, these methods usually require additional information from the user to produce accurate recommendations. We propose a new recommendation algorithm, which is cross-domain and cross-dataset. It relies only on Linked Data and does not require reducing the set of resources and links of the dataset to those belonging to a specific domain.

5.3 ReDyAl

ReDyAl is a recommendation algorithm which takes into account the different types of relationships between the data published according to the Linked Data principles. It aims at discovering related resources from datasets that may contain both well linked and poorly linked resources. A resource is said to be well linked if it has more links than the average number of links per node in the dataset; otherwise, it is poorly linked. The algorithm can dynamically adapt its behavior to find a set of candidate resources to be recommended, relying on the implicit knowledge contained in the Linked Data relationships.

5.3.1 Principles

As explained in Section 4.3.1, any dataset on the Web of Data may be seen as a tuple (R, T, L) composed of resources (R), categories (T), and relationships (L). Categories denote types, concepts or classes. Resources are instances of concepts identified by a URI. Relationships are also known as links or properties and connect resources or categories along the whole dataset graph. Categories are often hierarchically organized. For example, DBpedia provides information about hierarchical relationships in three different classification schemata: Wikipedia categories, YAGO classes2 [50], and WordNet synsets3 [51]. Relationships can be of three types: Resource-Resource (R-R), Resource-Category (R-T), and Category-Category (T-T). Considering this model of a dataset, ReDyAl consists of three stages:

1. The first stage discovers resources by analyzing the links between the given initial resource and other resources. Only R-R relationships are considered at this stage, although they can be indirect, i.e. they can connect two resources through a third one.

2. The second stage analyzes the categorization of the given initial resource and discovers similar resources located in the same categories. It finds indirect relationships between resources through direct R-T and T-T relationships. It is possible to specify to the algorithm which specific R-T and T-T relationships to consider in this step: the choice for R-T relationships is between dcterms:subject or rdf:type, while skos:broader and skos:narrower or rdfs:subClassOf are acceptable T-T relationships.

3. The last stage intersects the results of both the previous stages and ranks them by giving priority to those found in the first stage. The algorithm computes the similarity of the initial resource with any of the discovered resources, based on a similarity function which combines the Linked Data Semantic Distance (LDSD) [66] and HyProximity distance [71], opportunely adapted and generalized.

The algorithm can be applied to any dataset in the Web of Data. In the first step, it relies only on R-R relationships: any relationship of this kind may be used, independently of the data stored in the dataset. In the second step, the algorithm can be configured to use the dcterms:subject or rdf:type properties, which are R-T relationships. DBpedia uses both to enable different categorizations; for example, to rely on the Wikipedia categories, it is necessary to set dcterms:subject as R-T relationship and skos:broader and skos:narrower as T-T relationships. Any other dataset uses at least rdf:type to indicate the class which a resource is an instance of. Thus, rdf:type can be used to find resources in the same class and then rdfs:subClassOf can be used to retrieve more general classes (or skos:broader and skos:narrower, if the categories are organized through SKOS properties). The algorithm is independent of the application domain because it relies only on R-R, R-T or T-T links. If there are relationships among resources in different domains, the algorithm may generate cross-domain recommendations. For example, DBpedia is a general dataset which represents resources of different kinds, and there may be a relationship between a song and a city because the song was recorded in that city or is about the city. Alternatively, there may be a link between a song and a movie because the song was part of the soundtrack of the movie. Thus, a city or a movie may be recommended starting from a song. R-T links may also generate cross-domain recommendations if resources which belong to different domains are included in the same category.

2 http://www.mpi-inf.mpg.de/yago-naga/yago/
3 https://wordnet.princeton.edu

5.3.2 Reducing the Search Space

Additionally, the algorithm may be configured with a set of forbidden links to restrict the kind of links the algorithm should consider. This is useful to prevent the algorithm from obtaining resources over links pointing to empty nodes (i.e. resources without a URI), literals that are used to identify values such as numbers and dates, and other nodes that are not desired for the recommendation. In other words, it is a way to limit the results of the algorithm. For example, the DBpedia resource dbr:Turin contains the link dbpprop:populationTotal that points to the integer value 911823: we can configure this link as a forbidden link since it does not point to a resource which can be recommended. This is also useful to increase the performance of the algorithm because limiting the number of results decreases the ranking time. All the links which are not explicitly specified as forbidden are allowed links and define a domain of interest. This may be useful when the algorithm is applied to a generic dataset such as DBpedia. This dataset contains millions of links between resources, and if a developer is creating an application in the music domain then he/she may be interested only in resources of that domain, so he/she may want to consider only links pointing to those resources, i.e., a set of allowed links. In fact, the algorithm is cross-domain; thus it may recommend a city or a movie starting from a song, as we have already explained. While this may be an advantage in some applications, in others it may be confusing, especially if not adequately explained to the user. To limit the recommendations to specific categories of resources (for example to consider only tracks and artists), it is sufficient to “allow” only the relationships which point to these kinds of resources, i.e. have the desired category as their range.

5.3.3 Parameter Settings

ReDyAl receives as input an initial resource, specified by its corresponding URI (inURI), and three values (minT, minC, maxDistance) for configuring its execution. The selection of minT and minC is arbitrary and depends on the dataset and the convenience of the user who is setting up the algorithm. minT is the minimum number of links (input and output links involving the initial resource) necessary to consider a resource as well linked. The proper value of minT depends on the dataset: if it contains resources with a high number of links between them, it is expected to be higher, while if the resources have only a few links, it should be set to a lower value. However, this parameter impacts the algorithm's behavior: if the initial resource is well linked, transversal interlinking has a higher priority in the generation of candidate resources. Otherwise, the algorithm gives priority to the hierarchical relationships. For example, a user may consider the use of the hierarchical algorithms only if the resources are connected with fewer than ten links by setting minT to 10. In a similar way, the user may arbitrarily fix the value of minC, which is the minimum number of candidate resources that the algorithm is expected to generate, i.e. the number of candidate resources the user is expecting. The value of maxDistance limits the distance (i.e. the number of hierarchical levels) that the algorithm considers in a category tree. maxDistance may be defined manually; this is particularly useful when there are not enough candidate resources from the categories found at a certain distance (i.e. the number of candidate resources retrieved is lower than minC). In this case, the algorithm increases the distance to find more resources and, if the maxDistance value is reached with fewer than minC candidate resources, the algorithm ranks only the candidate resources found until that moment. Additionally, the algorithm may receive a list of forbidden links (FL) to avoid searching for candidate resources over a predefined list of undesired links.
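To make the role of the parameters concrete, the following sketch shows how an invocation of ReDyAl could be configured. The class and field names are hypothetical and only mirror the parameters described above.

```java
import java.util.Set;

// Hypothetical value object grouping the ReDyAl parameters described in Section 5.3.3.
class ReDyAlConfig {
    final String initialResourceUri;  // inURI
    final int minT;                   // minimum links to consider a resource well linked
    final int minC;                   // minimum number of candidate resources expected
    final int maxDistance;            // maximum hierarchical distance in the category tree
    final Set<String> forbiddenLinks; // FL: links never followed during generation

    ReDyAlConfig(String initialResourceUri, int minT, int minC,
                 int maxDistance, Set<String> forbiddenLinks) {
        this.initialResourceUri = initialResourceUri;
        this.minT = minT;
        this.minC = minC;
        this.maxDistance = maxDistance;
        this.forbiddenLinks = forbiddenLinks;
    }
}

class ReDyAlConfigExample {
    public static void main(String[] args) {
        // Use the hierarchical stage only when the resource has fewer than 10 links (minT = 10),
        // ask for at least 20 candidates, explore the category tree up to distance 2,
        // and never follow literal-valued properties such as the population link of dbr:Turin.
        ReDyAlConfig config = new ReDyAlConfig(
                "http://dbpedia.org/resource/Turin",
                10, 20, 2,
                Set.of("http://dbpedia.org/property/populationTotal"));
        System.out.println("Recommending for " + config.initialResourceUri);
    }
}
```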

5.3.4 Algorithm

ReDyAl (Algorithm 2) starts by retrieving a list of allowed links from the initial resource. Allowed links are those that are not specified as forbidden (FL) or that are explicitly defined in the initial resource. If there is a considerable number of allowed links (more than minT, i.e., the initial resource is well linked), the algorithm obtains a set of candidate resources located through direct (DRlk) or indirect transversal links (IRlk). This is done starting from the links explicitly defined in the initial resource (Lines 1-8). A resource is indirectly linked to the initial resource if it is linked through another resource. A resource directly linked is located at transversal distance 1 from the initial resource, while a resource indirectly linked is located at transversal distance 2 from the initial resource. With regard to the transversal links, a maximum distance of 2 is considered because for distances higher than 2 (i.e., one direct hop plus one indirect hop) the number of retrieved resources is dramatically increased, therefore also increasing the number of resources that are not relevant or related to the initial resource.

Next, if the current number of candidate resources generated (CRtr) is greater than or equal to minC, the algorithm terminates returning the results (Lines 9-10). Otherwise, the algorithm generates a category graph (Gc) with categories at the first distance and applies iterative updates over the category graph over n distances from the initial resource, obtaining broader categories (i.e. more generic categories that are located at a higher level in a classification) until at least one of the two following conditions is fulfilled: the number of candidate resources is sufficient (|CR| > minC), or the maximum distance is reached (currentDistance > maxDistance). At each iteration, candidate resources (CRhi) are extracted from the broader categories of maximum distance (Lines 14-23). In any case, the algorithm combines these results with the results obtained in Lines 3-8 (adding CRtr and CRhi to CR). Finally, the set of candidate resources is returned (Line 26).

Algorithm 2 ReDyAl algorithm
Require: inURI, minT, minC, FL, maxDistance
Ensure: A set of candidate resources CR
 1: Lin = readAllowedLinks(inURI, FL)
 2: if |Lin| ≥ minT then
 3:   for all lk ∈ Lin do
 4:     DRlk = getDirectResources(lk)
 5:     IRlk = getIndirectResources(lk)
 6:     Add DRlk to CRtr
 7:     Add IRlk to CRtr
 8:   end for
 9:   if |CRtr| ≥ minC then
10:     return CRtr
11:   else
12:     currentDistance = 1
13:     Gc = createCategoryGraph(inURI, currentDistance)
14:     while currentDistance ≤ maxDistance do
15:       CRhi = getCandidateResources(Gc)
16:       if |CRhi| ≥ minC then
17:         Add CRtr and CRhi to CR
18:         return CRhi
19:       end if
20:       increase currentDistance
21:       updateCategoryGraph(currentDistance)
22:     end while
23:     Add CRtr and CRhi to CR
24:   end if
25: end if
26: return CR

5.3.5 Ranking of the Recommended Resources

The final operation is ranking the sets of candidate resources. The ranking process receives as input the candidate resources retrieved by the ReDyAl algorithm and ranks them according to their degree of similarity with the initial resource. This similarity is computed based on a combination of two distance measures: LDSD and HyProximity, which have been presented in Section 4.3.3. The measure that combines LDSD and HyProximity used by ReDyAl is defined in Equation 5.1, where α and β may be set according to the convenience of the user: α is the weight for the transversal algorithm and β is the weight for the hierarchical algorithm. In this way, resources are ranked in descending order, arranged from the largest to the smallest value of Hybridsim.

Hybrid_{sim} = (1 - LDSD)\,\alpha + hyP(r_1, r_2)\,\beta \qquad (5.1)
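Under the weighted-combination reading of Equation 5.1, the ranking step amounts to computing the hybrid score for each candidate and sorting in descending order. The sketch below assumes the LDSD and HyProximity values have already been computed for each candidate; the weight values and candidate URIs are illustrative.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class HybridRankingSketch {

    // Equation 5.1: combination of (1 - LDSD) and HyProximity, weighted by alpha and beta.
    static double hybridSim(double ldsd, double hyProximity, double alpha, double beta) {
        return (1.0 - ldsd) * alpha + hyProximity * beta;
    }

    // Sorts candidate URIs by decreasing hybrid similarity.
    static Map<String, Double> rank(Map<String, double[]> distances, double alpha, double beta) {
        return distances.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> hybridSim(e.getValue()[0], e.getValue()[1], alpha, beta)))
                .entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // For each candidate: [LDSD, HyProximity]; alpha and beta are example weights.
        Map<String, double[]> distances = Map.of(
                "http://dbpedia.org/resource/Inception", new double[]{0.2, 1.5},
                "http://dbpedia.org/resource/Heat_(1995_film)", new double[]{0.5, 0.8});
        rank(distances, 0.5, 0.5).forEach((uri, score) -> System.out.println(uri + " -> " + score));
    }
}
```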

5.4 User Evaluation

We comparatively evaluated the prediction accuracy and the novelty of the resources recommended with ReDyAl with respect to three state-of-the-art recommendation algorithms relying exclusively on Linked Data to produce recommendations: dbrec [66], HyProximity transversal and HyProximity hierarchical [71]. This evaluation aimed to answer the following questions:

RQ1 Which of the considered algorithms is more accurate?

RQ2 Which of the considered algorithms provides the highest number of novel recommendations?

We decided to rely on a user study because we were interested in evaluating the novelty of the proposed recommendations in addition to the accuracy. Since we cannot expect that users have rated all the items they already know, a user study can measure novelty more precisely than an offline study. On the other hand, user studies are more expensive to conduct than offline studies; for this reason, we focused on recommendation algorithms based only on Linked Data, and we did not consider algorithms which exploit traditional techniques or combine Linked Data with traditional techniques. We plan to conduct other experiments to compare our method with other methods and investigate the effectiveness of our approach combined with traditional approaches. Although our algorithm is not bound to any particular dataset, we applied it to DBpedia because it is a general dataset that offers the possibility to evaluate the results in different scenarios. DBpedia is one of the biggest datasets in the Web of Data and the most interlinked [69]. Furthermore, it is frequently updated and continuously grows.

5.4.1 Experiment

A user study was conducted involving 109 participants. The participants were mainly students of Politecnico di Torino (Italy) and the University of Cauca (Colombia) enrolled in IT courses. The average age of the participants was 24 years; 91 were male, 14 female, and 4 did not provide any information about their sex. Although the proposed algorithm is not bound to any particular domain, this evaluation focused on movies because we aimed at applying our algorithm in the mobile application presented in Section 5.5.1 (which suggests movies) and in this domain a quite large amount of data is available on DBpedia. Additionally, it was easier to find participants, since no specific skills are required to express an opinion about movies. The algorithms were compared within subjects [22], since each participant evaluated recommendations from different algorithms, as explained in the following. The evaluation was conducted as follows. A list of 20 recommendations generated from a given initial movie was presented to the participants. For each recommendation two questions were asked:

Q1 Did you already know this recommendation? Possible answers were: yes, yes but I haven’t seen it (if it is a movie) and no.

Q2 Is it related to the movie you have chosen? Possible answers were: I strongly agree, I agree, I don’t know, I disagree, I strongly disagree. Each answer was assigned a score from 5 to 1 respectively.

We developed a website4 to collect the answers from the participants. The participants were able to choose an initial movie from a list of 45 movies selected from the IMDB top 250 list.5 The first 50 movies were considered, and five movies were excluded because they were not available in DBpedia. Choosing these movies ensured that participants knew them, but was also a limitation: the corresponding DBpedia resources are very well linked. Thus we could not properly evaluate the algorithm on poorly linked initial movies. The movies were presented to the user in a random order to avoid having most of the participants evaluating recommendations for the same initial movies (e.g. the first in the list). When a participant selected an initial movie, the tool provided the corresponding list of recommendations with the questions mentioned above. The recommendations were presented in a randomized order. Each participant was able to evaluate recommendations from as many initial movies as he wanted, but he had to answer the questions for all the recommendations, i.e. it was not possible to answer only part of the questions for the chosen initial movie. As a result, the recommendation lists for 40 out of 45 initial movies were evaluated by at least one participant and each movie was evaluated by an average of 6.18 participants. The dataset with the initial movies and the lists of recommendations is available online.6 Each list of 20 recommendations was pre-computed. In particular, recommendations were generated for each of the 45 initial movies with each of the four different algorithms. Then, the recommendations generated by each algorithm were merged into a list of 20 recommendations to be shown to the participants. To do this, we created a list of 40 recommendations by selecting the first 10 pre-computed recommendations for each algorithm, and we ordered them by the similarity computed by each algorithm, since each algorithm ranks its recommendations by using its semantic similarity function with values between 0 and 1. Then we eliminated any duplicates, since more than one algorithm could provide the same recommendation. The final list was obtained considering the first 20 recommendations of the merged list. With regard to the questions stated at the beginning of this section, to answer RQ1 the Root Mean Squared Error (RMSE) [22] was computed, and to answer RQ2 the ratio between the number of evaluations in which the recommended item was not known by the participants and the total number of evaluations was computed. For the RMSE measure, the scores given by the participants when answering Q2 were considered as reference and were normalized in the interval [0, 1], and these scores were compared to the similarities computed by each algorithm, since each algorithm ranks its recommendations by using its semantic similarity function.

4 http://natasha.polito.it/RSEvaluation/
5 http://www.imdb.com/chart/top
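For reference, the RMSE over the normalized user scores and the algorithm similarities can be computed as in the following short sketch; the normalization of the 1-5 answer scale to [0, 1] is an assumption consistent with the description above.

```java
public class RmseSketch {

    // Maps the 1-5 answer scale of Q2 to the [0, 1] interval.
    static double normalize(int score) {
        return (score - 1) / 4.0;
    }

    // Root Mean Squared Error between algorithm similarities and normalized user scores.
    static double rmse(double[] similarities, int[] userScores) {
        double sum = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            double diff = similarities[i] - normalize(userScores[i]);
            sum += diff * diff;
        }
        return Math.sqrt(sum / similarities.length);
    }

    public static void main(String[] args) {
        double[] similarities = {0.9, 0.4, 0.7};
        int[] userScores = {5, 2, 4};
        System.out.println(rmse(similarities, userScores)); // error over three evaluations
    }
}
```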

5.4.2 Results

The results of the evaluation are summarized in Figure 5.1, which compares the algorithms on their RMSE and novelty. The “sweet spot” area represents the conditions in which an algorithm has a good trade-off between novelty and prediction accuracy.

6 http://natasha.polito.it/RSEvaluation/faces/resultsdownload.xhtml

[Figure 5.1 is a scatter plot of RMSE (y-axis, 0–100) against novelty (x-axis, 0–100) for dbrec, HyProximity hierarchical, HyProximity transversal, and ReDyAl, with the “bad performers”, “sweet spot”, and “excellent performers” regions highlighted.]

Fig. 5.1 Prediction accuracy and novelty of the algorithms evaluated.

In effect, presenting a high number of recommendations not known to the user is not necessarily good, because it may prevent him from assessing the quality of the recommendations: for example, having among the provided recommendations a movie which he has seen and liked may increase the trust of the user in the RS. Regarding RQ1, HyProximity accounts for the lowest RMSE measures (25% and about 36% for the hierarchical and transversal versions respectively), but these results are less significant due to the low number of answers to Q2 for these algorithms. In effect, this means that the RMSE was computed over a low number of recommendations. For both ReDyAl and dbrec the RMSE is roughly 45%. Concerning RQ2, the two versions of HyProximity account for the highest values (hierarchical approximately 99%, transversal about 97%). However, such a high rate of novel recommendations may confuse the user and prevent him from judging recommendations, as we have already explained. ReDyAl has a larger proportion of novel recommendations than dbrec. These two algorithms account for about 60% and 45% respectively. The recommendations generated by HyProximity in both the transversal and hierarchical versions collected a low number of answers to Q2 because most of the recommendations generated by these algorithms were unknown, as illustrated in

Table 5.1. Consequently, the RMSE was computed over a low number of recommendations. Thus, the results of these two algorithms related to RQ1 are less definitive than for the others, since for measuring the prediction accuracy only the evaluations for which the answer to Q1 was either yes or yes but I haven’t seen it (if it is a movie) were considered.

Algorithm                   Yes     Yes but I haven’t seen it   No
ReDyAl                      27.95   9.17                        62.88
dbrec                       41.10   11.95                       46.95
HyProximity hierarchical    1.08    0.36                        98.56
HyProximity transversal     1.32    1.89                        96.79

Table 5.1 Percentage of answers to Q1 by algorithm.

We computed the Fleiss’ kappa measure [86] for assessing the agreement of the participants in answering Q2. We considered the recommendations and, in particular, we treated the same recommendation as different when related to a different initial movie (i.e. when appearing in different lists of recommendations). We excluded recommendations not evaluated or evaluated by only one participant. The Fleiss’ kappa is 0.79; according to Landis and Koch [87], this corresponds to a substantial agreement. In conclusion, Figure 5.1 illustrates that ReDyAl and dbrec provide a good trade-off between prediction accuracy and novelty (sweet spot area), although ReDyAl performs better in novelty. HyProximity hierarchical and HyProximity transversal seem to be excellent performers, since the RMSE is small and the novelty is high, but the RMSE was computed on few evaluations. An additional analysis of these two algorithms is needed to verify whether the user can benefit from such a high novelty and whether novel recommendations are relevant. In addition, further investigation is needed on poorly linked resources, since the choice of the initial movies focused on selecting popular movies to make the evaluation easier for participants, but the related resources were well linked. On poorly linked resources, we expect ReDyAl and HyProximity hierarchical to keep providing good recommendations, since they can rely on categories, while dbrec and HyProximity transversal are likely to provide much fewer recommendations, since they rely on direct links between resources.

5.5 Applications

In this section, we describe two real-life applications of our recommendation al- gorithm: a mobile application to suggest movies and an eTourism platform to recommend tourist attractions to visit.

5.5.1 Mobile Movie Recommendations

An implementation of ReDyAl has been integrated into a mobile application developed in collaboration with Telecom Italia (the major network operator in Italy). This application recommends movies based on DBpedia: when the user enters the title of a movie, the application provides the Wikipedia categories to which the initial movie is related. In this way, the user may focus on a specific scope and can receive recommendations of related resources for any category. In addition, it is possible to view any recommendation to obtain additional information.

Our algorithm can provide cross-domain recommendations because it is independent of the domain and is applied to DBpedia, which is a general dataset. Thus, the recommended resources can be movies but also other relevant entities such as actors, directors, places of recording, books that inspired the movie, etc. Other advantages of using DBpedia as dataset are the high number of resources that it represents, the variety of domains addressed, and its continuous update and growth, since it is extracted from Wikipedia.

For example, given The Matrix as the initial movie, the categories to which it belongs are presented. The user may be more interested in martial arts or post-apocalyptic movies, or he may prefer to consider all the movies from American directors; thus he can choose a category accordingly. By selecting Post-apocalyptic films, some resources are recommended. For each recommendation, it is possible to open a detailed view, which contains three tabs: the first provides a brief textual description, the second presents a graph view of the resource to show its main properties, and the third summarizes the main information in a tabular form. The graph view is illustrated in Figure 5.2. The graph is paginated, and few properties per page are presented to avoid information overload, since the resource can have a very high number of properties. This view can also be useful to explain the recommendation: for instance, the user can understand that the recommended resource has the same director or the same leading actor as the initial movie.

Fig. 5.3 The interactions between the main modules of the application.

Fig. 5.2 The graph view of V for Vendetta.

The graph view is based on DBpedia Mobile Explorer, a Linked Data visualization framework for the mobile environment, which enables the application to hide the underlying complexity of Linked Data from users by processing the resources received from DBpedia before presenting them. The framework is presented in Section 7.3.

The application is based on a client-server architecture, and the main modules are DBpedia, a RESTful recommender service (http://natasha.polito.it/LDRecommenderWeb/) which exposes our algorithm, and the mobile user interface. The main flow of interactions is represented in Figure 5.3. The mobile application asks for recommendations specifying an initial resource and optionally a scope such as a Wikipedia category (1). The recommender service answers with a list of scopes if no scope was provided, or otherwise with a list of recommendations in the specified scope (2). The recommender service relies on DBpedia to provide recommendations (3, 4), and the mobile application retrieves the resources to be visualized from the dataset (5, 6). The recommender service is developed in Java, while the client is an Android mobile application. The two modules use JSON as data-interchange format, while the mobile application retrieves resources from DBpedia serialized in JSON-LD (http://json-ld.org/). The mobile application is going to be published on Google Play, but the Android Package (APK) of the first version is already available on the Web (https://www.dropbox.com/sh/0q8d2mcbko9e2oj/AAASh-YHGz0MmG_Z8hH6mfWOa?dl=0).
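To make the request/response flow described above more concrete, the following minimal sketch shows how a client might invoke the recommender service. The base URL, the example movie, and the category come from the text, while the endpoint path and the query parameters (resource, scope) are illustrative assumptions and not the documented API of the service.

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RecommenderClient {

    // Base URL taken from the text; the path and query parameters below are
    // hypothetical and only illustrate the request/response flow (steps 1-2).
    private static final String BASE = "http://natasha.polito.it/LDRecommenderWeb/";

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();

        String resource = URLEncoder.encode("http://dbpedia.org/resource/The_Matrix",
                StandardCharsets.UTF_8);
        String scope = URLEncoder.encode("Category:Post-apocalyptic_films",
                StandardCharsets.UTF_8);

        // Step 1: the mobile client asks for recommendations for an initial
        // resource, optionally restricted to a scope (a Wikipedia category).
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(BASE + "recommendations?resource=" + resource
                        + "&scope=" + scope))
                .header("Accept", "application/json")
                .GET()
                .build();

        // Step 2: the service replies with JSON (a list of scopes if no scope
        // was given, otherwise a list of recommended resources).
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}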

5.5.2 eTourism Platform

Due to the increase of Linked Data published on the Web, it is increasingly likely to find information related to real-life concepts, and the information associated with them can be enriched in the form of User Generated Content (UGC) by using semantic annotation techniques. In particular, we consider a tourism scenario, where more detailed information about the attractions of a city and other POIs (such as monuments, hotels, etc.) can be extracted from datasets in the Web of Data. For example, in DBpedia we have information about many tourist attractions, such as the Colosseum or St. Peter's Basilica in Rome. Thus, when a mobile user is in a tourism situation like visiting a city, processing the information associated with the tourist attractions of a city or other POIs could enable more reliable recommendations of more interesting places. This use case is detailed in Subsection Use Case.

This is possible because any resource in the Web of Data has a URI, i.e. it is uniquely identified. Hence, it is possible to access it to get the information about the object it represents. Nowadays it is also simpler to publish information on the Web. Any user can easily insert new content in textual or multimedia format (with the use of contemporary mobile devices, this is becoming more and more real time): for instance, some news may appear on Twitter before it appears on any news portal. For example, considering a car accident in a city, witnesses can post information or content about it even before the local press agency arrives. Last but not least, it is also easier to link information with already existing resources. In fact, much effort has been made by the Linked Data community to provide more ways to increase the publication of Linked Data. There are plenty of tools to annotate raw data and link them, e.g. DBpedia Spotlight [88] or the FI-WARE Semantic Annotator GE (for which details are provided in Subsection Semantic Annotator).

Another element to consider is that the structure of Linked Data can be exploited. In a tourism use case, it may be useful to consider only hotels or only parks within a city. This can be done since Linked Data is structured: for example, considering DBpedia, any resource belongs to one or more categories, and categories of hotels and monuments exist. In addition, categories are organized in a tree hierarchy; thus it is possible to consider more general or more specific categories. If we refer to Saint Petersburg, then we can also consider the hotels (associated to Category:Hotels_in_Saint_Petersburg), the monuments (corresponding to Category:Monuments_and_memorials_in_Saint_Petersburg) and so on. The whole city may be mapped to Category:Buildings_and_structures_in_Saint_Petersburg. Finally, the information in Linked Data and its structure can be used to improve the user experience.

In the following, we propose a platform to support tourism use cases, such as the one described in Subsection Use Case, which exploits ReDyAl to provide recommendations based on Linked Data. The overall system is shown in Figure 5.4. Our platform communicates with mobile devices and with the Web of Data using web interactions, and it is made up of three main modules: a recommender system that implements ReDyAl, a UGC manager, and a semantic annotator.
The first provides information from the Web of Data and from the stored UGC, while the second stores and retrieves UGC, which is linked to resources of the Web of Data by means of the annotator. In the following, we provide a description of each component of the platform, apart from the recommender system, which is based on ReDyAl and has already been presented in Section 5.3.
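As an illustration of how the category structure discussed earlier in this subsection can be exploited, the following sketch queries the public DBpedia endpoint for the hotels of Saint Petersburg. The use of Apache Jena is an assumption made for the example; the dct:subject property and the category URI follow standard DBpedia modeling.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;

public class CategoryLookup {

    public static void main(String[] args) {
        // Resources linked to a Wikipedia category via dcterms:subject,
        // e.g. the hotels of Saint Petersburg mentioned in the text.
        String query =
                "PREFIX dct: <http://purl.org/dc/terms/> " +
                "SELECT ?hotel WHERE { " +
                "  ?hotel dct:subject " +
                "    <http://dbpedia.org/resource/Category:Hotels_in_Saint_Petersburg> . " +
                "} LIMIT 20";

        try (QueryExecution exec = QueryExecutionFactory
                .sparqlService("http://dbpedia.org/sparql", query)) {
            ResultSet results = exec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row.getResource("hotel").getURI());
            }
        }
    }
}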

Fig. 5.4 The overall system architecture.

Semantic Annotator

Thanks to the FI-WARE EU project (https://www.fiware.org/), it is possible to reuse existing components for creating modern applications and services. In this case, we have reused the Semantic Annotator GE, which is publicly available from the FI-WARE catalogue (http://catalogue.fi-ware.org/enablers/semantic-annotation). The decision whether to reuse an existing component or to create a new one was taken after a careful analysis, which concluded that the existing component offers greater advantages and reduces the integration time on the platform. Besides, it encourages the standardization of the existing modules.

To extract semantic information, the plain-text information associated with the received content is analyzed by the Semantic Annotator GE (whose architecture is shown in Figure 5.5). First, the Text Processor module identifies the source language; then a morphological analysis is performed using FreeLing (http://nlp.lsi.upc.edu/freeling/) configured for the identified language. From this analysis, proper-noun lemmas are extracted, while other parts of speech are discarded. Non-numeric proper-noun lemmas with a score of at least 0.2 are preserved and merged with plain-text tags to compute a well-defined list of unique (multi-)words. At this stage, the module uses term frequency to process the title further and to extract other potentially relevant words.

Fig. 5.5 Semantic Annotator GE at a glance.

The next step involves the Semantic Broker, which is assisted by a set of resolvers that perform full-text or term-based analysis based on the previous output. Such resolvers are aimed at providing candidate semantic concepts referring to Linked Data, as well as additional related information if it is available. Resolvers may be domain or language specific, or general purpose. For term-based analysis, each word of the previously computed list is individually processed to identify a list of candidate Linked Data resources to match with. A set of predefined services, such as DBpedia, Sindice (http://sindice.com) and Evri (http://www.evri.com), are invoked in parallel.

The Semantic Filtering module processes the candidate Linked Data resources received from the broker and performs a disambiguation based on the DBpedia score and the string similarity between each surface form and its corresponding list of candidates, which relies on the Jaro-Winkler distance [89]. This function aims at maximizing both values to identify the "preferred" candidate. After several empirical tests, candidates with a Jaro-Winkler value lower than 0.8 are discarded at this stage, unless their DBpedia score is the maximum. Automatic annotation is performed using the "preferred" candidate identified during this step.
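The disambiguation rule just described can be approximated as follows. The sketch uses the JaroWinklerSimilarity class of Apache Commons Text as a stand-in for the distance function of the GE (an assumption), and the Candidate record is a hypothetical shape for the resolvers' output; the 0.8 threshold and the DBpedia-score exception mirror the rule above.

import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import org.apache.commons.text.similarity.JaroWinklerSimilarity;

public class SemanticFiltering {

    /** A candidate DBpedia resource returned by the resolvers (hypothetical shape). */
    record Candidate(String uri, String label, double dbpediaScore) {}

    private static final JaroWinklerSimilarity SIMILARITY = new JaroWinklerSimilarity();

    /**
     * Picks the "preferred" candidate for a surface form: candidates whose
     * string similarity is below 0.8 are discarded unless their DBpedia score
     * is the maximum; among the survivors the most similar one is returned.
     */
    static Optional<Candidate> disambiguate(String surfaceForm, List<Candidate> candidates) {
        double maxScore = candidates.stream()
                .mapToDouble(Candidate::dbpediaScore).max().orElse(0.0);

        return candidates.stream()
                .filter(c -> SIMILARITY.apply(surfaceForm, c.label()) >= 0.8
                        || c.dbpediaScore() == maxScore)
                .max(Comparator.comparingDouble(c -> SIMILARITY.apply(surfaceForm, c.label())));
    }

    public static void main(String[] args) {
        List<Candidate> candidates = List.of(
                new Candidate("http://dbpedia.org/resource/Hermitage_Museum", "Hermitage Museum", 0.9),
                new Candidate("http://dbpedia.org/resource/Hermitage", "Hermitage", 0.4));
        disambiguate("Hermitage museum", candidates)
                .ifPresent(c -> System.out.println(c.uri()));
    }
}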

UGC Management

This layer of the platform is responsible for saving UGC and its semantically enriched version. Thus, for each given UGC entry, we have:

• UGC itself (any given multimedia file);

• Original associated plain information (stored in a SQL database);

• Associated semantic information represented in RDF (stored in a triple store).

Additionally, this layer offers the means to retrieve specific content, either by invoking a REST API (to get the attached multimedia content) or by performing more complex SPARQL queries on the public endpoint.

Use Case

The previously described platform can be applied in a tourism scenario and can be exploited by an application that assists tourists by providing them with suggestions about places to visit, accommodations, and other points of interest (POIs). Furthermore, it allows users to share their experience by providing content such as pictures, videos, reviews, and comments about places they have visited. This content is then available to other users and can be exploited to enrich the information about tourist attractions and other facilities.

For example, if a user is in Saint Petersburg, some tourist attractions are the Peter and Paul Fortress, the Saint Isaac's Cathedral and the Hermitage Museum. All of these tourist attractions have a corresponding representation in DBpedia; thus they support user interactions, since users can decide whether or not to visit them by consulting the information provided by DBpedia. There are two possible interactions:

1. user device receives information about nearby tourist attractions;

2. user publishes information regarding the tourist attractions from his device (generates content).

The first enables users to be provided with recommendations about tourist attractions, while the latter allows users to share their experiences and increase the amount of available information to other tourists.

Fig. 5.6 Possible interactions in an eTourism use case.

Figure 5.6 illustrates the two different possible interactions for a user in Saint Petersburg. He is at the Hermitage Museum, and his device receives recommendations about the Peter and Paul Fortress and the Saint Isaac's Cathedral, which are nearby. He can access the DBpedia information related to each of them and also the related UGC shared by other users (1, 2). Then he decides to take a picture with his friends and make it available to other tourists; thus he adds information to the Hermitage Museum by using the semantic annotator and the UGC manager (3).

This use case has been developed together with Telecom Italia, the primary network provider in Italy, and it shows how information about real-world objects stored in the Web of Data can be exploited in a tourism scenario. In this case, the information about tourist attractions and POIs allows users to decide how to behave, e.g. whether or not to visit a particular monument.
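As a sketch of interaction (1), the following query retrieves geolocated resources of the Category:Buildings_and_structures_in_Saint_Petersburg category within a small bounding box around the user. The coordinates, the size of the box, and the use of Apache Jena are illustrative assumptions; geo:lat and geo:long are the standard WGS84 properties used by DBpedia.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSet;

public class NearbyAttractions {

    public static void main(String[] args) {
        // Rough bounding box around the Hermitage Museum (about 59.94 N, 30.31 E);
        // the +/- 0.01 degree window is an arbitrary illustrative choice.
        String query =
                "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> " +
                "PREFIX dct: <http://purl.org/dc/terms/> " +
                "SELECT ?poi ?lat ?long WHERE { " +
                "  ?poi dct:subject " +
                "    <http://dbpedia.org/resource/Category:Buildings_and_structures_in_Saint_Petersburg> ; " +
                "       geo:lat ?lat ; geo:long ?long . " +
                "  FILTER(?lat > 59.93 && ?lat < 59.95 && ?long > 30.30 && ?long < 30.32) " +
                "} LIMIT 10";

        try (QueryExecution exec = QueryExecutionFactory
                .sparqlService("http://dbpedia.org/sparql", query)) {
            ResultSet results = exec.execSelect();
            results.forEachRemaining(row ->
                    System.out.println(row.getResource("poi").getURI()));
        }
    }
}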

5.6 Conclusions and Future Work

We presented ReDyAl, a hybrid algorithm that dynamically uses both the transversal and the hierarchical approach for discovering resources. It is independent of the application domain and, although we applied it to DBpedia, it could be easily adapted to other datasets in the Web of Data. It relies only on Linked Data and does not require reducing the set of resources and links of the dataset to those belonging to a particular domain.

We evaluated and compared our algorithm against three state-of-the-art algorithms by conducting a user study, and we also showed two practical applications of the algorithm by presenting a mobile application that provides movie recommendations and an eTourism mobile application, both of which rely on DBpedia. Although the algorithm could be applied to other datasets in the Web of Data, we selected DBpedia because it is a general dataset; thus cross-domain recommendations were possible. Besides, it represents a high number of resources, addresses a variety of domains, and is continuously updated since it is extracted from Wikipedia. The user study demonstrated that ReDyAl improves the novelty of the discovered results, although its accuracy is not the highest (due to its inherent complexity). Although ReDyAl is not bound to any particular domain, the study focused on movies because Telecom Italia was interested in a related use case. Furthermore, in this domain there is a quite large amount of data available on DBpedia, and participants were not required to have specific skills.

Future work includes studying the relevance under different domains and improving the accuracy of ReDyAl while maintaining its novelty. We plan to conduct other studies to compare it with traditional techniques and with approaches which combine Linked Data with traditional techniques. We are also working on combining ReDyAl with collaborative filtering techniques to take user preferences into account while providing recommendations. It is worth noting that ReDyAl could be extended to consider more than one resource as input (e.g. all the resources rated positively by the user). In order to do this, ReDyAl could be executed multiple times to generate recommendations given a number of initial resources, and subsequently the results could be merged. However, this would significantly increase the response time, since the algorithm relies on SPARQL queries to discover candidate recommendations through the links among resources, which is computationally expensive. Thus, we should study how to do this taking performance into account. Another resource to consider could be the current context of the user. Context-awareness is addressed in Chapter 6, where we present a context-aware recommendation technique. ReDyAl could also be extended with that method: for example, it could be used to select an initial resource for ReDyAl from a set of user ratings based on the context.

Chapter 6

Leveraging Ontologies for Context-Aware Recommendations

6.1 Introduction

Context-Aware Recommender Systems (CARS) are a particular category of recommender systems which exploit contextual information to provide more useful recommendations. For example, in a temporal context, vacation recommendations in winter should be very different from those provided in summer. Similarly, a restaurant recommendation for a Saturday evening with your friends should be distinct from that suggested for a workday lunch with co-workers [1]. Nowadays, contextual information such as time and location can easily be obtained with modern devices. However, other parameters may also be considered, such as the company (alone, with friends, with one's partner), which may be relevant when recommending movies or vacations.

In addition, the exact context can sometimes be too narrow, as Adomavicius and Tuzhilin [14] exemplified by considering the context of watching a movie with a girlfriend in a movie theater on Saturday. Using this exact context may be problematic for several reasons. First, certain aspects of the overly specific context may not be significant. For example, a user who watches a movie with their partner in a theater may have the same preferences on Saturday and Sunday, but these preferences may change on Wednesday. Therefore, it may be more appropriate to use a more general context specification, i.e. weekend instead of Saturday. Second, the exact context may not have enough data for accurate rating prediction, which is known as the data sparsity problem. Thus it may be useful to refer to a more general context, such as watching a movie with one's partner in a movie theater on the weekend, watching a movie with someone in a movie theater on the weekend, and so on.

Additionally, user preferences and item representations often depend on the application domain addressed or on the particular recommendation approach used. Thus, a significant effort is required to adapt the recommender system to another domain or to change the approach used. In this chapter, we address the problems previously mentioned and we focus on the following research questions:

• Is it possible to represent context by combining different dimensions (such as time, location, mood, etc.) and representing different granularities for each dimension (e.g. the precise time moment, the day of the week or the season)?

• Is it possible to represent user preferences and items in such a way that can be adapted to different application domains and combined with different recommendation approaches?

We distinguish three forms of context-aware recommendation process: contextual pre-filtering, contextual post-filtering, and contextual modeling [14]. Pre-filtering approaches use the current context to select a relevant subset of data on which the recommendation algorithm is applied. Post-filtering methods exploit contextual information to select only relevant recommendations returned by some algorithm. Contextual modeling differs from the other techniques as it incorporates the context into the recommendation algorithm. We opted for a pre-filtering strategy because it can be used with existing recommendation algorithms and avoids an expensive search for an effective post-filtering approach, as explained in Section 6.2 (an exhaustive review of CARS is out of the scope of this thesis; the reader may refer to the survey of Adomavicius and Tuzhilin [14]).

We propose a new contextual pre-filtering approach which is based on two ontologies to represent context and user preferences: Recommender System Context (RSCtx, http://softeng.polito.it/rsctx/), which describes the context, and Contextual Ontological User Profile (COUP), which represents user preferences. COUP is based on the Structured-Interpretation Model (SIM) [90] and consists of multiple ontological modules. We

evaluated our approach through an offline study with a rating prediction task which showed that the usage of the proposed ontologies and our pre-filtering technique with a number of well-known recommendation algorithms significantly improves the accuracy of prediction according to the Mean Absolute Error (MAE) measure. The rest of the chapter is organized as follows: Section 6.2 provides an overview of CARS, Section 6.3 presents related work, Section 6.4 introduces our ontology to represent the context, while Section 6.5 addresses the overall recommendation approach and the representation of user preferences. We detail the evaluation process and its results in Section 6.6 and we conclude in Section 6.7.

6.2 Context-Aware Recommender Systems

As explained in Section 2.2, traditionally RS deal with users and items; thus, the function used to estimate users' ratings is two-dimensional. On the contrary, CARS need to incorporate the available contextual information into the recommendation process as an additional type of data. The user's preferences are usually expressed as ratings and are modeled as a function not only of items and users, but also of the context. This implies defining ratings with a three-dimensional rating function f : U × I × C → R, where R is the domain of ratings (a totally ordered set), U and I are the domains of users and items respectively, and C specifies the contextual information associated with the application [14].

Adomavicius and Tuzhilin [14] distinguished two approaches to exploit contextual information in RS: recommendation via context-driven querying and search, and recommendation via contextual preference elicitation and estimation. Systems which rely on the former strategy typically use contextual information to query or search a certain repository of resources and suggest the best-matching resources. The context is obtained either directly from the user, e.g. by specifying the current mood or interest, or from the environment, e.g. by acquiring the local time, weather, or current location. However, the recent trend is to exploit the latter method. Techniques that adopt it model and learn user preferences, e.g. by observing the interactions with the system or by obtaining preference feedback from the user on previously recommended items. To model users' context-sensitive preferences and generate recommendations, these techniques typically either adapt existing collaborative-filtering, content-based, or hybrid recommendation methods to context-aware recommendation settings, or apply data analysis techniques from data mining or machine learning.

In general, we can identify three main components of the recommendation process: the input data, the recommendation function, and the recommendation list. We can apply contextual information at several stages of this process and distinguish three types of CARS based on which component the context is used in [14]:

Contextual pre-filtering is the contextualization of the recommendation input. The specific context considered drives data selection or data construction: the current context is used for selecting or constructing the relevant set of ratings. Then, ratings can be predicted using any traditional two-dimensional recommender system (see the sketch after this list). Its primary advantage is that it can be applied to existing recommendation methods [16]; the disadvantage is that the context can be too narrow, as explained in Section 6.1.

Contextual post-filtering is the contextualization of the recommendation output. The contextual information is initially ignored, and the ratings are predicted using any traditional recommender system on the entire data. Then, the resulting set of recommendations is adjusted for each user using the current context. The adjustment can be a filter or a modification of the recommendation list. In the former case, some items are removed because they are irrelevant for the current context, while in the latter the ranking of items varies based on the context. Similarly to pre-filtering, it can be applied to available recommendation approaches, and it needs a properly generalized context.

Contextual modeling is the contextualization of recommendation function. The context is used directly in the modeling technique as part of rating estimation. This method requires a multidimensional recommendation function. Thus it is not possible to reuse existing recommendation techniques, although some have been extended to manage more dimensions.
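As referenced above, the following minimal sketch illustrates the pre-filtering strategy under assumed data types (it is not the implementation described later in this chapter): the current, possibly generalized, context selects the relevant ratings, and the reduced two-dimensional data can then be passed to any traditional recommender.

import java.util.List;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

public class ContextualPreFiltering {

    /** A rating together with the context in which it was given (hypothetical shape). */
    record ContextualRating(String user, String item, double value, String context) {}

    /**
     * Keeps only the ratings whose context matches the (possibly generalized)
     * current context; the resulting two-dimensional ratings can be fed to any
     * traditional recommender system.
     */
    static List<ContextualRating> preFilter(List<ContextualRating> ratings,
                                            String currentContext,
                                            BiPredicate<String, String> matches) {
        return ratings.stream()
                .filter(r -> matches.test(r.context(), currentContext))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ContextualRating> ratings = List.of(
                new ContextualRating("u1", "The_Matrix", 4.5, "weekend"),
                new ContextualRating("u1", "V_for_Vendetta", 3.0, "weekday"));

        // Exact matching here; a generalized context (e.g. "weekend" instead of
        // "Saturday") makes the predicate less restrictive.
        List<ContextualRating> relevant = preFilter(ratings, "weekend", String::equals);
        relevant.forEach(System.out::println);
    }
}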

Panniello et al. [91] compared pre-filtering and post-filtering approaches and showed that the best approach depends on the application. They suggest exploiting pre-filtering when it performs better than the un-contextual case, because it avoids an expensive search for an effective post-filtering method.

6.3 Related Work

We distinguish between works which addressed representation of context and other ontology-based recommender systems proposed. The former are presented in Sec- tion 6.3.1 while the latter are briefly described in Section 6.3.2.

6.3.1 Context Representation

In this section, we first address ontology-based context modeling and then we review context representation for recommender systems.

Many context ontologies have been proposed in the context-awareness community. There are a number of surveys which review the literature relevant to context modeling in general [92, 93], or focus on ontology-based models [94, 95]. In addition, Costabello [96] presented and compared a number of ontology-based context models against a set of requirements. These requirements also fit our purpose; therefore, in the following we present them and summarize Costabello's comparison, also considering RSCtx. The relevant requirements are:

R1. Domain independence. Some context ontologies have been created to model a given domain-specific scenario. Others adopt a domain-independent approach.

R2. Coverage. The ontology must guarantee a proper level of completeness for the desired contextual dimensions. The model must support multiple context dimensions such as device features, user preferences, location and time.

R3. Variable Context Granularity. Certain ontologies model context dimensions at different levels of granularity. For example, the location might be expressed in terms of latitude and longitude, or with a label assigned to a place (e.g. office, beach, cinema, etc.).

R4. Core ontology approach. The vocabulary must adopt a modular design, thus focusing on modeling core classes and properties that will be extended by third-party domain specialists.

Costabello [96] also considered some requirements related to the Linked Data principles, which also fit our purpose: 98 Leveraging Ontologies for Context-Aware Recommendations

R5. Open World Assumption. The Web of Data is an open environment, and de- scribing context in this scenario must consider third-party extensions unknown beforehand. Extensibility must be obtained with little effort; thus add-ons must not impact on the already existing model.

R6. Lightweight Ontology. According to Linked Data best practices [35], the goal is to keep ontologies small and straightforward.

R7. Reuse of Existing Terms. Linked Data best practices favor the reuse and the combination of classes and properties of existing vocabularies. This is done to prevent the proliferation of terms and reduce the range of choices when modeling data.

R8. Availability on the Web. Web of Data vocabularies are published on the Web, and accessible according to Web of Data best practices.3 Moreover, they are associated with an HTML page, the “namespace document”, whose task is to provide a textual description of the vocabulary rationale, along with classes and properties explanation and examples.

Work  R1 R2 R3 R4 R5 R6 R7 R8
PRISSMA [96]: • • ◦ • • • • •
DCO: • ◦ • ◦ •
SOUPA [97]: • • • • ◦
CoOL [98]: • ◦ ◦ • ◦
CONON [99]: • • • • •
CoDaMos [100]: • • • ◦
Korpipää et al. [101]: • ◦ ◦
Hervás and Bravo [102]: • • •
RSCtx: • • • • • • • •

Table 6.1 A comparison of ontology-based context models [96]. Full support is identified by •, partial support by ◦, no support by the empty cell.

Following these requirements, Costabello [96] compared a number of ontologies which modeled context and proposed PRISSMA,4 a vocabulary designed to model client generated context data. In the following, we present the main features of this vocabulary and of the other related works shown in Table 6.1.5 PRISSMA satisfies 3https://www.w3.org/TR/swbp-vocab-pub/ 4http://ns.inria.fr/prissma 5An exhaustive review of the related literature is out of the scope of this thesis. The reader can refer to more complete surveys [92–95]. 6.3 Related Work 99 most of the requirements mentioned above, although variable context granularity is only partially satisfied. All the works provide coverage and are domain independent, and all but one support (at least partially) the open world assumption. The only other ontology published on the Web is the Delivery Context Ontology (DCO),6 a modular and fine-grained vocabulary to model mobile platforms. It does not provide linking with other vocabularies, and it is not considered a lightweight ontology. The SOUPA ontology [97] is an OWL ontology which is extensible, i.e. supports the open world assumption and reuses external ontologies, but it does not comply with Linked Data principles, for example, it is not publicly available on the Web. CoOL [98] is a modular OWL ontology, which is grounded on F-Logic and uses features typically avoided in a lightweight ontology. CONON [99] is another modular OWL ontology, which is not published on the Web and does not reuse existing vocabularies. CoDaMos [100] is an extensible OWL ontology that is available on the Web but no namespace vocabulary is present. It is not lightweight and does not reuse other vocabularies. Korpipää et al. [101] presented a context model designed for mobile, context-aware applications. This model is general but does not reuse existing terms, and it is not extensible. Hervás and Bravo [102] proposed a modular context model composed by independent ontologies. It supports extensions, although it does not reuse already existing linked data ontologies.

Context Representation for Recommender Systems

Various works address context representation for RS. Abowd et al. [103] distin- guished between primary and secondary context: the former can be directly measured, while the latter needs to be derived from other types of contextual information. Kaminskas and Ricci [104] reviewed the literature about contextual music re- trieval. They distinguished between environmental, user-related and multimedia context. The first refers to information about the location of the user, the current time, weather, temperature, etc.. The second concerns information about the activity of the user, the user’s demographic information, and the emotional state. The third applies to other types of information the user is exposed to besides music, e.g., text and images. In addition to traditional dimensions (time location etc.) the authors suggested traffic, noise and light level. As multimedia context, they mention textand 6http://www.w3.org/TR/2009/WD-dcontology-20090616/ 100 Leveraging Ontologies for Context-Aware Recommendations images. They indicated some cases in which it can be useful to consider this kind of context, e.g. for adapting music to text context as done by Cai et al. [105]. Baltraunas et al. [106] proposed an approach to assess which contextual factors are important and to which degree they influence user ratings. They conducted a study in which users were asked to judge whether a contextual factor affects the rating given a certain contextual condition. In their survey they focus on tourism domain and consider budget, time availability, and transport in addition to traditional dimensions. RSCtx supports most of the addressed dimensions and distinguishes between user-related and environmental context. It does not address multimedia context, but it considers the device features.

6.3.2 Ontology Based Recommender Systems

It has been proved that the ontological user profile improves recommendation ac- curacy and diversity [107]. More specifically, a number of ontology-based and context-aware RS have been proposed. In the following, we briefly review the main approaches, but a more detailed description of ontology-based techniques is provided by Di Noia and Ostuni [10] and Lops et al. [25]. AMAYA allows management of contextual preferences and contextual recom- mendations [108]. AMAYA also uses an ontology-based content categorization scheme to map user preferences to entities to recommend. News@hand [109] is a hybrid personalized and context-aware recommender system, which retrieves news via RSS feed and annotates by using system domain ontologies. User context is repre- sented by a weighted set of classes from the domain ontology. Rodriguez et al. [110] proposed a CARS which recommends Web services. They use a multi-dimensional ontology model to describe Web services, a user context, and an application domain. The multi-dimensional ontology model is made up of three independent ontologies: a user context ontology, a web service ontology, and an application domain ontology, which are combined into one ontology by some properties between classes from different ontologies. The recommendation process assigns a weight to the items based on a list of interests in the user ontology. All these works focus on a particular domain and an ad-hoc algorithm, while our approach to represent user preferences is cross-domain and can be applied to different recommendation algorithms. 6.4 The Recommender System Context Ontology 101

Hawalah and Fasli [111] suggest that each context dimension should be described by its own taxonomy. Time, date, location, and device are considered as default context parameters in the movie domain. It is possible to add other domain-specific context variables as long as they have a clear hierarchical representation. Besides context taxonomies, this approach uses a reference ontology to build contextual personalized ontological profiles. The key feature of this profile is the possibility of assigning user interests in groups, if these interests are directly associated with each other by a direct relation, by sharing the same superclass, or by sharing the same property.

Other works use ontologies and taxonomies to improve the quality of recommendations. Middleton et al. [5] used an ontological user profile to recommend research papers. Both research papers and user profiles are represented through a taxonomy of topics, and recommendations are generated considering topics of interest for the user and papers classified in those topics. Mobasher et al. [6] proposed a measure which combines semantic knowledge about items and user-item ratings, while Anand et al. [2] inferred user preferences from rating data using an item ontology. Their approach recommends the items using the ontology and inferred preferences while computing similarities.

6.4 The Recommender System Context Ontology

Recommender System Context (RSCtx) extends PRISSMA, a vocabulary based on Dey's definition of context [112]. PRISSMA relies on the W3C Model-Based User Interface Incubator Group proposal (http://www.w3.org/2005/Incubator/model-based-ui/XGR-mbui/), which describes mobile context as an encompassing term, defined as the sum of three different dimensions: user model and preferences, device features, and the environment in which the action is performed. A graph-based representation of PRISSMA is provided in Figure 6.1.

We designed RSCtx following METHONTOLOGY [113], a well-known ontology design method. We assumed that there is a predefined set of contextual dimensions in a given application, each with a defined set of attributes, and we modeled the contextual information relevant to provide recommendations. We did not focus on any particular domain; on the contrary, we aimed at reusing the ontology in different

Fig. 6.1 The PRISSMA vocabulary [96].

Fig. 6.3 Temperature representation in our ontology.

Fig. 6.2 The relations and concepts which extend prissma:Environment. applications. As in PRISSMA, the point of view used to describe the context itself is the application point of view, we considered the user itself as part of the context. We needed a more detailed representation of the environment, to consider other contextual dimensions such as the purpose of the user and the weather. Fig- ure 6.2 shows how prissma:Environment has been extended, by adding a num- ber of properties and related concepts. To represent the weather, we integrate hw:WeatherState from the Weather Ontology.8 In this ontology the temperature is represented as the room temperature, thus we defined a new class to represent sym- bolic values of temperature (such as warm, cold, etc.) and an attribute to represent numeric values, as shown in Figure 6.3. We also extended the time and location representations. We needed a more expressive model of these two dimensions, since asking for recommendations which 8https://www.auto.tuwien.ac.at/downloads/thinkhome/ontology/ WeatherOntology.owl 6.4 The Recommender System Context Ontology 103

Fig. 6.4 The concepts and relations of RSCtx which represent the location dimension. have the same time stamp and the coordinates of the actual context is too restrictive, and the recommender system may not have enough data. On the contrary, by generalizing the context (for example distinguishing among weekend and working day, or considering the city or neighborhood instead of the actual user position) may enable the recommender system to provide recommendations. The concept prissma:POI has been extended with various properties to represent the location in the context of a particular site by integrating the Buildings and Rooms vocabulary.9 Furthermore, other properties related to the hierarchical organization of the location (such as the neighborhood, the city and the province of the current user position) have been added, and some concepts from the Juso ontology10 have been reused. Figure 6.4 depicts relations and attributes which characterize a location. Yellow rectangles indicate concepts from rooms vocabulary, while blue ones are taken from Juso. The representation of time augments time:Instant defined in the Time ontology.11 Some time intervals have been defined: the hours and the parts of the day (morning, afternoon, etc.). Besides, days of the week are classified in weekdays or weekend and seasons are represented. Figure 6.5 illustrates how time is represented and the relations with PRISSMA and the Time ontology. 9http://vocab.deri.ie/rooms 10rdfs.co/juso/latest/html 11https://www.w3.org/2006/time 104 Leveraging Ontologies for Context-Aware Recommendations

Fig. 6.5 Time representation in our ontology as extension of Time and PRISSMA ontologies.

Furthermore, we extended the user representation adding some dimensions which may be of interest, as the emotional, mental and physiological state of the user or his fitness. This can be interesting mainly in the medical or fitness domain,but the emotional state can affect the user also in taking other kinds of decisions, like choosing a movie to watch or music to listen to. Emotional, mental and physiological state concepts are equivalent to the emotional, mental and physiological state in the General User Model Ontology (GUMO) [114], an ontology to describe the user which is available on the Web, although it is not compliant with Linked Data principles since it has not a namespace assigned. In addition, the emotional state is an extension of emoca:Emotion, which is defined in the Emotion Ontology for Context Awareness (EmOCA).12 We added some attributes to the physiological state and also defined an arousal relation which reuses emoca:Arousal. Figure 6.6 depicts the user representation in our ontology. 12http://ns.inria.fr/emoca/ 6.5 Recommendation approach 105

Fig. 6.6 User representation in RSCtx ontology. Fig. 6.7 Emotion model.

The emotion in EmOCA is represented according to Russell's model [115]. We extended emoca:Emotion; in particular, we added pleasure and dominance as subclasses of emoca:Component in order to represent emotions according to the Pleasure Arousal Dominance (PAD) model [116] as well, as shown in Figure 6.7. In this way, we can indicate that the emotion is defined by valence and arousal using emoca:isDefinedBy to refer to Russell's model, while we can state that the emotion is defined by pleasure, arousal, and dominance to refer to the PAD model. Furthermore, it is possible to refer to an emotion simply by indicating its category (such as joy, anger, disgust, etc.). In EmOCA, six categories have already been defined, which can also be used in RSCtx since the emotional state is a subclass of emoca:Emotion. More categories can be added to our ontology, although we have not done so.

6.5 Recommendation approach

6.5.1 The Contextual User Profile Ontology

To model user profiles we used the Structured Interpretation Model (SIM) [117, 118], which consists of two types of ontological modules, i.e. context types and context instances. Context types describe the terminological part of an ontology (TBox) and are arranged in a hierarchy of inheritance. Context instances describe assertional part of an ontology (ABox) and are connected with corresponding context types through a relation of instantiation. Context instances of more specific context types are linked to a context instance of a more general context type through a relation of aggregation. In the class hierarchy in a classical ontology there always exists a top 106 Leveraging Ontologies for Context-Aware Recommendations

Fig. 6.8 SIM at a glance. Fig. 6.9 An example of COUP. concept, i.e. Thing. In a SIM ontology, there is a top context type and a top context instance connected by instantiation. It is possible to add multiple context instances to one context type and aggregate multiple context instances into one context instance. An overview of SIM is available in Figure 6.8. The idea of adapting a SIM ontology as a user profile was proposed by Karpus and Goczyła [119]. They modeled contextual user profiles using only three context variables, i.e. location, time and mood, which influences a split of terminology into ontological modules. Our approach is different in some crucial aspects. First of all, we allow storage of many user profiles in the same SIM ontology. We also support a storage of preferences from multiple domains by adding context types related to different domains. Another difference is the number of context variables permitted. We add context types and context instances related to contextual parameters in a dynamic way. As a consequence, we can use as many variables as needed in our approach. An example of a contextual profile for one user is shown in Figure 6.9. Only three modules in the example illustrated in Figure 6.9 are fixed: UserType, topContextInstance and topContextType. All others are configurable or can be added in a dynamic way. In topContextType we defined the concept Rating and its corresponding roles, e.g. isRatedWith and hasValue. UserType is artificial and is present in the SIM ontology because it enables to add many user profiles to the ontology. In the next level of the hierarchy, there are context types that describe domains of interests related to a recommender system which will use the profile. In the next levels, all context types and instances are added to the contextual user profile during the learning phase or later, when a new context situation occurs. 6.5 Recommendation approach 107

The general process of learning the user profile is as follows. At the beginning, there is just the RSCtx ontology and an empty contextual ontology, i.e. one with the terminological part only. For a given user, an item, its rating, and the situation in which it was consumed are taken from the user's history. The level of granularity of the context is checked against the RSCtx ontology and is changed if needed, e.g. shifting from time = 2 p.m. to time = afternoon. A context instance is created for this context if it is not already available. Finally, the item with its rating is added to the identified context instance. Each item is represented as a set of individuals of appropriate concepts defined in a domain context type.
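The learning loop can be summarized schematically as follows; the interfaces below are hypothetical placeholders for the RSCtx and COUP components, not their actual API.

import java.util.List;

public class ProfileLearning {

    /** One entry of the user's history (hypothetical shape). */
    record RatedItem(String itemUri, double rating, String rawContext) {}

    interface ContextOntology {
        /** Generalizes raw context data to the configured granularity, e.g. 2 p.m. -> afternoon. */
        String generalize(String rawContext);
    }

    interface UserProfile {
        /** Returns the context instance for this context, creating it if needed. */
        String findOrCreateContextInstance(String generalizedContext);

        /** Adds an item and its rating to the given context instance. */
        void addPreference(String contextInstance, String itemUri, double rating);
    }

    static void learn(List<RatedItem> history, ContextOntology rsctx, UserProfile coup) {
        for (RatedItem entry : history) {
            // 1. Adjust the context granularity with the RSCtx ontology.
            String context = rsctx.generalize(entry.rawContext());
            // 2. Find or create the matching context instance in COUP.
            String instance = coup.findOrCreateContextInstance(context);
            // 3. Attach the rated item to that context instance.
            coup.addPreference(instance, entry.itemUri(), entry.rating());
        }
    }
}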

6.5.2 Recommendation

We use the ontologies previously presented in a pre-filtering method integrated in the recommendation process. The aim is to provide a context-aware technique to improve existing algorithms.

The system consists of three main functional modules: context detection and generalization, user profile and pre-filtering, and recommendation. In the first module, we used the RSCtx ontology to identify the user context from raw data and generalize it to the desired granularity level. The second module is responsible for building the user profile, finding a context instance that fits the actual user context, and returning only the relevant user preferences. The last module uses well-known algorithms, e.g. Item kNN, User Average, SVD++, for providing recommendations. For this task we exploit implementations from the LibRec library (http://www.librec.net/).

The general recommendation process is summarized in Figure 6.10. Given a user and his current situation, a proper generalization of his context is generated by using the RSCtx ontology. Then, an appropriate context instance from COUP is identified by using the generalized context. If a context instance is not found in the user profile, the generalization step is repeated to search for a module that corresponds to the new context. If it is found, the user preferences are prepared to be used with a recommendation algorithm. Given the current context, the generalization module creates a context instance in RSCtx and initializes all its properties. For example, given a timestamp, the day of the week and the part of the day can be set. Similarly, from latitude and longitude, it is possible to obtain the address, the city, and the country through a geocoding service.

Fig. 6.10 General recommendation process.

Once the instance is initialized, an initial granularity is set for each dimension, e.g. the day of the week for the time dimension and the city for the location dimension. The initial granularities to set are additional parameters of the generalization algorithm, since they may depend on the particular application scenario considered. It is possible to generalize the context by specifying a broader granularity for one or more dimensions. If a granularity is not given, the context is generalized by one step, e.g. switching from the part of the day to the day of the week. An example with the time dimension is shown in Figure 6.11. Given the timestamp corresponding to May 5th, 2017 at 14:30, the context is instantiated and the properties are initialized as depicted. The initial granularity is the day of the week, i.e. Friday in this case.

Fig. 6.11 A context instance in RSCtx with only the time dimension.

If not enough preferences with time information equal to Friday are found in COUP, the context is generalized: for example, instead of Friday, the granularity is set to weekday. The minimum number of preferences required is a parameter, since it may depend on the recommendation algorithm used.
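A minimal sketch of the time-dimension generalization just described is shown below, using java.time; the granularity levels and their labels are an illustrative stand-in for the corresponding RSCtx concepts.

import java.time.DayOfWeek;
import java.time.LocalDateTime;

public class TimeGeneralization {

    /** Illustrative granularity levels, ordered from specific to general. */
    enum TimeGranularity { PART_OF_DAY, DAY_OF_WEEK, WEEKDAY_OR_WEEKEND, SEASON }

    static String describe(LocalDateTime t, TimeGranularity g) {
        switch (g) {
            case PART_OF_DAY:
                int h = t.getHour();
                return h < 6 ? "night" : h < 12 ? "morning" : h < 18 ? "afternoon" : "evening";
            case DAY_OF_WEEK:
                return t.getDayOfWeek().toString();           // e.g. FRIDAY
            case WEEKDAY_OR_WEEKEND:
                DayOfWeek d = t.getDayOfWeek();
                return (d == DayOfWeek.SATURDAY || d == DayOfWeek.SUNDAY) ? "weekend" : "weekday";
            case SEASON:
                int m = t.getMonthValue();                    // northern-hemisphere seasons
                return m == 12 || m <= 2 ? "winter" : m <= 5 ? "spring" : m <= 8 ? "summer" : "autumn";
            default:
                throw new IllegalArgumentException();
        }
    }

    public static void main(String[] args) {
        // The example from the text: May 5th, 2017 at 14:30.
        LocalDateTime t = LocalDateTime.of(2017, 5, 5, 14, 30);
        System.out.println(describe(t, TimeGranularity.PART_OF_DAY));        // afternoon
        System.out.println(describe(t, TimeGranularity.DAY_OF_WEEK));        // FRIDAY
        System.out.println(describe(t, TimeGranularity.WEEKDAY_OR_WEEKEND)); // weekday
    }
}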

6.6 Evaluation

We conducted an offline study to evaluate the RSCtx ontology and the COUP ontology. We selected a number of algorithms, and we compared the accuracy of each algorithm when used as is and when combined with the proposed ontologies. We aimed to answer the following question: does our context and user preference representation improve the accuracy of recommendation algorithms?

We relied on the ConcertTweets dataset [120], which combines implicit and explicit user ratings with rich content as well as spatiotemporal contextual dimensions and social network data. It contains ratings that refer to musical shows and concerts of various artists and bands. Since the dataset was generated automatically, there were some duplicated events; for example, the same concert occurred twice because the country appeared once as United Kingdom and once as UK. We fixed these cases by eliminating the duplicates. Another problem with the dataset is the use of two rating scales: one numerical scale with ratings in the range [0.0, 5.0] and one descriptive scale with possible values equal to yes, maybe and no, although no never occurred. We decided to split the dataset into two separate sets according to the scale type, and we mapped the descriptive values yes, maybe and no to the numerical values 2, 1 and 0. Table 6.2 presents some statistics about the data by considering the whole dataset and each of the sets generated when splitting by scale type. We prepared two pairs (one for each scale) of training and test sets for hold-out validation. In each test set, we put the 20% newest ratings of each user. All other ratings were placed in each training set. The split was performed based on rating timestamp values.

Our pre-filtering approach can be used with existing recommendation methods. Thus, we evaluated the ontologies with five algorithms: Random Guess, Item kNN, User Average, SVD++ and Time SVD++. We compared the results of the first four algorithms without pre-filtering and with pre-filtering, while the fifth was executed without pre-filtering only, because it already contains time as a contextual factor [121].
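The dataset preparation described above (mapping the descriptive scale to numeric values and holding out the newest 20% of each user's ratings) can be sketched as follows; the Rating record is an assumed shape, not the actual ConcertTweets schema.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class DatasetPreparation {

    record Rating(String user, String item, double value, long timestamp) {}

    record Split(List<Rating> train, List<Rating> test) {}

    /** Maps the descriptive scale to numeric values; "no" never occurred in the data. */
    static double mapDescriptive(String answer) {
        switch (answer) {
            case "yes":   return 2.0;
            case "maybe": return 1.0;
            case "no":    return 0.0;
            default: throw new IllegalArgumentException(answer);
        }
    }

    /** Puts the newest 20% of each user's ratings in the test set, the rest in training. */
    static Split holdOutSplit(List<Rating> ratings) {
        List<Rating> train = new ArrayList<>();
        List<Rating> test = new ArrayList<>();
        ratings.stream()
                .collect(Collectors.groupingBy(Rating::user))
                .values()
                .forEach(byUser -> {
                    List<Rating> sorted = byUser.stream()
                            .sorted(Comparator.comparingLong(Rating::timestamp))
                            .collect(Collectors.toList());
                    int cut = (int) Math.ceil(sorted.size() * 0.8);
                    train.addAll(sorted.subList(0, cut));
                    test.addAll(sorted.subList(cut, sorted.size()));
                });
        return new Split(train, test);
    }
}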

                                              All       Descriptive  Numeric
                                                        ratings      ratings
Number of users                               61803     56519        16479
Number of musical events                      116320    110207       21366
Number of pairs artist and musical events     137382    129989       23383
Number of ratings                             250000    219967       30033
Maximum number of ratings per user            1423      1419         92
Minimum number of ratings per user            1         1            1
Average number of ratings per user            4.045     3.892        1.823
Maximum number of ratings per item            218       216          38
Minimum number of ratings per item            1         1            1
Average number of ratings per item            2.149     1.996        1.406
Number of users who ranked at least 5 items   13241     11548        962
Number of users who ranked at least 10 items  5369      4639         190
Number of users who ranked at least 50 items  289       244          4
Number of users who ranked at least 100 items 66        54           0
Sparsity                                      0.999971  0.999970     0.999922

Table 6.2 Statistics of the ConcertTweets dataset available at the time of the experiment.

We used Time SVD++ as a baseline for comparing our contextual pre-filtering technique combined with the SVD++ algorithm. We performed an experiment for the rating prediction task and measured accuracy with MAE. Results are presented in Table 6.3 and Figures 6.12 and 6.13. It should be noticed that, without pre-filtering, the User Average algorithm outperforms SVD++. This may be due to the way in which users rate musical events: it may be possible that they do not use the whole rating scale but just a part of it, e.g. a user evaluates only those events that they like (their ratings are always greater than 3.0). As can be seen in Figure 6.12 and Table 6.3, when our ontological pre-filtering approach is applied, results on the numerical scale subset are better.

                           Numeric ratings       Descriptive ratings
Contextual pre-filtering   YES       NO          YES       NO
Random Guess               0.2315    2.0998      0.4694    0.4989
User Average               0.2312    0.3026      0.3624    0.2570
Item kNN                   0.2312    0.3976      0.3624    0.4374
SVD++                      0.2514    0.3511      0.3621    0.3101
Time SVD++                 NA        0.2693      NA        0.2975

Table 6.3 MAE values computed for the whole test set.


Fig. 6.12 MAE of different algorithms computed per user on the subset with numeric ratings.

Our contextual pre-filtering combined with classical SVD++ performs better than Time SVD++. There could be two reasons for this behavior: either the use of various contextual parameters in addition to time improves prediction accuracy, or our approach (even if used with the time parameter only) combined with SVD++ is truly better than the Time SVD++ algorithm. This should be addressed in further work.


Fig. 6.13 MAE of different algorithms computed per user on the subset with descriptive ratings.

Figure 6.13 shows that the median value of MAE for our approach is lower for all algorithms, but the overall MAE for the descriptive scale subset is higher for all of the cases. This suggests that in the case of binary scale (yes/maybe) contextual pre-filtering may increase the sparsity and noisiness of the data. Thus, the recommendation algorithm may not always predict the rating. However, a user may rate differently the same event before and after participating: this could be the cause of different results for the two subsets. The rating is more reliable when a user evaluates an item after they consume it than when they declare what they would do or prefer. This could lead us to the conclusion that this approach could be successfully applied in recommender systems where numeric scale is used to rate items a posteriori. Currently, we have not identified any other limitation for using the proposed contextual pre-filtering approach. To verify the statistical significance of the results, we applied the Wilcoxon test with p-value 0.01. We chose this statistical test because we cannot guarantee the normal distribution of the results obtained. The test confirmed the statistical significance of our results.

6.7 Conclusions and Future Work

In this chapter, we presented a new approach for contextual pre-filtering in recommender systems. It is based on two ontologies: Recommender System Context (RSCtx), which models the user's context, and Contextual Ontological User Profile (COUP), which represents user preferences. RSCtx extends PRISSMA and represents different context dimensions on different granularity levels. COUP was modeled according to the SIM approach for modularization: different parts of the user profile are represented in different ontological modules. This allows us to: (i) store multiple users in one ontology, (ii) clearly distinguish user preferences from different domains while keeping all the user preferences in the same ontology, and (iii) split user interests from one domain into "micro profiles" related to some contextual situation without losing the possibility of reasoning on different levels of context granularity.

We successfully applied RSCtx for context identification and generalization tasks, showing that it is possible to represent context by combining various dimensions and representing different granularities for each dimension. We used COUP for representing user preferences in various contexts in the domain of musical events and for obtaining the user data relevant to the current context for a rating prediction task with baseline algorithms. An offline study showed that the usage of the proposed ontologies with a number of recommendation algorithms can significantly improve their prediction accuracy. This confirmed part of the second research question, i.e. that it is possible to represent user preferences and items in such a way that they can be combined with different recommendation approaches. The next step in our research is proving that we can adapt our user representation to various domains.

As future work, we plan to extend our experiment to ranking tasks, as well as to investigate the influence of the proposed approach on the diversity and novelty of recommendations. We also plan to integrate the contextual pre-filtering approach into Allied (introduced in Chapter 4). For example, since all the algorithms implemented within the framework need an initial resource to provide recommendations, the pre-filtering method could be exploited to select the initial resource from a set of user ratings based on the context.

Chapter 7

Visualizing Linked Data Based Recommendations

7.1 Introduction

DBpedia (http://dbpedia.org) is one of the main datasets in the Web of Data. It has the highest number of connections with other datasets and represents about four million resources. The data in DBpedia are extracted from Wikipedia and are available in different languages. Linked Data resources on the Web are steadily increasing, but there is still a lack of effective ways to present them to users. Since Linked Data relies on representations and languages such as RDF [11] and SPARQL [15], dereferencing a URI in the Web of Data rarely provides an intuitive representation of the resource. In particular, there is a limited number of Linked Data browsers, and they are mainly oriented to tech-users, i.e. expert users who understand Linked Data technologies [122]. Thus, it is necessary to provide user-friendly visualizations of Linked Data and hide the complexity of RDF and SPARQL from lay-users, who do not have the skills to understand these technologies. In this chapter, we present DBpedia Mobile Explorer, a Linked Data visualization framework for the mobile environment, which allows users to explore DBpedia by hiding the complexity of RDF and SPARQL. It can be configured as a generic DBpedia browser, thus enabling the visualization of the whole dataset, or it can focus on a limited number of resources and properties, generating a browser customized for a specific domain. The framework is designed to work with DBpedia, but it can be adapted to other datasets and also to other services in the Web of Data, since it relies only on RDF and SPARQL. The framework has been used in the mobile application presented in Section 5.5.1 to display recommendations based on Linked Data, and it originated from the needs of Telecom Italia. In fact, the company was interested in having a visualization layer that could adapt to different domains, to be used in combination with the cross-domain algorithm introduced in Chapter 5. The remainder of the chapter is organized as follows. Section 7.2 reviews the works which addressed the visualization of Linked Data. Section 7.3 introduces DBpedia Mobile Explorer; in particular, it presents two reference use cases of the framework and details how it works. Finally, Section 7.4 provides the conclusions.

7.2 Linked Data Visualization

DBpedia Mobile Explorer can be considered one of the works which try to hide the complexity of SPARQL and RDF from lay-users. In the following, we discuss a number of DBpedia interfaces and Linked Data browsers; then we review other works which provide user interfaces to deal with SPARQL and RDF. Dadzie and Rowe [122] performed a survey on Linked Data visualization approaches. In the following, we mention the main works they considered and some additional relevant works; a complete review of the relevant literature is beyond the scope of this thesis, and Dadzie and Rowe [122] provide a more detailed discussion of the related works. DBpedia Mobile [123] is a Linked Data browser for mobile devices. It displays nearby locations available in DBpedia on a map and allows users to browse information about them. It exploits Marbles (http://mes.github.io/marbles/), a server-side application that formats Web of Data content for XHTML clients by exploiting Fresnel [124] vocabularies. DBpedia Viewer [125] is a DBpedia interface which provides browsing and integration functions for Linked Data. It allows some actions on the visualized triples (e.g. annotation) and integrates some DBpedia services and external visualization tools, such as LodLive [126] and RelFinder [127]. The former is a tool which can browse a SPARQL endpoint directly by using a JavaScript application layer, without requiring any application server, and exploits a graph-based visualization, while the latter enables users to explore connections among entities by showing paths in the underlying RDF graph.

There are also visualization tools not oriented to DBpedia. Payola (http://payola.cz/) is a web framework for analyzing and visualizing Linked Data [128]. It provides an editor in which SPARQL queries and custom plugins can be combined. Furthermore, the user can choose a visualizer to see the results in various forms. Pubby (http://wifo5-03.informatik.uni-mannheim.de/pubby/) is a Java web application that provides a Linked Data interface to SPARQL endpoints. It presents the data available about each resource using a static HTML interface. We also considered other works which try to hide SPARQL and RDF complexity. Shi3ld Policy Manager [129] is a user interface for querying and editing Linked Data on a SPARQL endpoint. It supports dataset administrators in defining access control policies on target elements. Linked Data Query Wizard [130] is a web-based tool for displaying, accessing, filtering, exploring, and navigating Linked Data stored in SPARQL endpoints. Its main visualization functionality is converting graphs into tables. Ngomo et al. [131] presented a framework that converts SPARQL queries into natural language, to be integrated into applications where lay-users are required to understand or generate SPARQL queries. Sonntag and Heim [132] provided graph visualizations and navigation of RDF resources in a mobile environment, but only in the football domain. In summary, none of the previously mentioned works has all the characteristics that we require. In fact, we need to support lay-users and mobile devices. Additionally, it is necessary to address different domains without preventing users from focusing on a specific one. Only two of the discussed works support mobile devices, and both of them deal with information from a single domain (geographic locations or data about football).

7.3 Towards a Linked Data Visualization Framework for Mobile Devices

7.3.1 Use Cases

We consider two different use cases for the framework. In both, the essential function for the end users is browsing DBpedia, but the scope may change. In fact, DBpedia is a cross-domain dataset, and sometimes users may not be interested in all the information provided; on the contrary, they may want to focus on a particular domain. A user may be interested only in films, or cities, or books. Thus, it may be better to exclude resources which are not relevant, to avoid information overload and noise. For this reason, and considering that interest is subjective, we designed a use case to visualize any DBpedia resource and another which allows browsing only a limited number of resources. In the following, we address the two use cases separately.

Generic DBpedia Browser

In this use case, the user can visualize any DBpedia resource and any of its properties. A typical interaction of a user with the framework is the following:

1. The user inputs a resource available in DBpedia.

2. A brief description is presented, and a graph including all the triples having that resource as subject can be visualized. The main information about the resource (e.g. the title and director of a movie) is also reported in tabular form.

3. From the resource it is possible to browse other resources starting from its DBpedia categories. In particular, all the categories concerning the resource are retrieved, and a list is presented to the user.

4. The user can browse a category and obtain a list of all the resources included in that category. Then, he can select a resource and restart from step 2; the queries behind steps 2–4 are sketched below.
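The thesis does not list the exact queries behind these steps; the sketch below only illustrates, with plausible SPARQL strings embedded in Java, how steps 2–4 could be mapped to DBpedia queries (the resource and category URIs are examples).

public final class DBpediaQueries {

    // Step 2: all triples having the given resource as subject (graph and tabular views).
    static String describeResource(String resourceUri) {
        return "SELECT ?p ?o WHERE { <" + resourceUri + "> ?p ?o }";
    }

    // Step 3: the DBpedia categories of the resource.
    static String categoriesOf(String resourceUri) {
        return "SELECT ?category WHERE { <" + resourceUri
                + "> <http://purl.org/dc/terms/subject> ?category }";
    }

    // Step 4: the resources included in a selected category.
    static String resourcesIn(String categoryUri) {
        return "SELECT ?resource WHERE { ?resource "
                + "<http://purl.org/dc/terms/subject> <" + categoryUri + "> }";
    }

    public static void main(String[] args) {
        // Example: browsing dbpedia:The_Matrix, as in Figures 7.4 and 7.5.
        System.out.println(categoriesOf("http://dbpedia.org/resource/The_Matrix"));
    }
}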

Domain-based DBpedia Browser

From the user’s point of view, the framework behaves as described for the generic DBpedia browser, but it only considers resources in a particular domain. For resources out of the domain, it is possible to visualize only a brief description, and it is not possible to browse further (i.e. the categories which include such a resource cannot be obtained and explored). The domain of interest is not fixed and can be configured as explained in Section 7.3.2.

Fig. 7.1 The main components of our framework.

7.3.2 DBpedia Mobile Explorer

The framework is made up of the three modules shown in Figure 7.1. DBpedia is the dataset which provides the resources to be visualized, while the mobile application is responsible for presentation. The interaction between them is based on SPARQL and RDF. The code generation module provides part of the application: it generates the code depending on the current configuration of the framework. The mobile application is organized according to the Model View Controller (MVC) pattern. The model consists of classes, which represent the resources considered, and a parser to load them from the resources received from DBpedia. The code generation module provides these components. Controllers and views are the core of the mobile application, and they rely only on the models, so that they do not need to be updated when the models change; this keeps the application general and reusable with different models and parsers. In the following, we detail the approach, provide additional information about the technology used to implement the framework, and describe the user interface.

Approach

Figure 7.2 illustrates how users’ operations are mapped into SPARQL queries. The queries are hidden from the user by the user interface.

Fig. 7.2 User operations and corresponding SPARQL queries.

If DBpedia Mobile Explorer is configured as a domain-based browser, it checks the rdf:type property before visualizing a resource and presents only a brief description if the type does not match any of the classes specified in the domain. Similarly, the framework displays only the properties indicated. The domain of interest is custom and can be defined by specifying the classes and the properties to take into account. Any resource whose type differs from the specified classes is out of the domain, and any property of a resource other than the specified properties is not considered. Tech-users or domain experts can define the domain, since knowledge of DBpedia classes and properties is required. This approach can be applied to any other dataset because it relies only on RDF. In fact, while defining the domain, it is possible to refer to any class or property of the underlying vocabularies. Since the framework is designed to work with DBpedia, we exploit the dcterms:subject property to obtain the categories of a given resource, and also to retrieve the resources included in a given category. To refer to other category schemata, it is possible to change the property considered: e.g. it is possible to exploit YAGO [50] classes by setting rdf:type instead of dcterms:subject. The use of rdf:type also makes it possible to refer to any class, i.e. any dataset may be considered.
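As an illustration of the domain check, the following sketch issues an ASK query testing the rdf:type of a resource against one of the configured classes. The use of Apache Jena and the method names are assumptions made only for this example; they are not taken from the framework’s actual code.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;

public class DomainFilter {

    // Checks whether a resource belongs to the configured domain by testing
    // its rdf:type against one of the classes listed in the configuration,
    // e.g. the YAGO movie class of Listing 7.1.
    static boolean isInDomain(String resourceUri, String domainClassUri) {
        String ask = "ASK { <" + resourceUri + "> "
                + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <" + domainClassUri + "> }";
        try (QueryExecution exec =
                     QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", ask)) {
            return exec.execAsk();
        }
    }

    public static void main(String[] args) {
        System.out.println(isInDomain(
                "http://dbpedia.org/resource/The_Matrix",
                "http://dbpedia.org/class/yago/Movie106613686"));
    }
}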

Technology Stack

The framework relies on an Android client application. It can interact with DBpedia using a number of SPARQL queries to access the resources and browse the categories. The RDF serialization currently supported is JSON-LD (http://json-ld.org/), and JSON is also the format used to receive SPARQL results.
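The following sketch shows one plausible way for a client to request SPARQL results serialized as JSON over HTTP, using only standard Java classes; the query, the content negotiation, and the class names are illustrative and not taken from the framework’s source code.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlJsonClient {

    // Sends a SELECT query to the DBpedia endpoint and returns the raw
    // SPARQL results serialized as JSON (the format the client parses).
    static String queryAsJson(String sparql) throws Exception {
        String url = "http://dbpedia.org/sparql?query="
                + URLEncoder.encode(sparql, StandardCharsets.UTF_8.name());
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept", "application/sparql-results+json");

        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        }
        return body.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(queryAsJson(
                "SELECT ?label WHERE { <http://dbpedia.org/resource/The_Matrix> "
                        + "<http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT 5"));
    }
}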


class: http://dbpedia.org/class/yago/Movie106613686; properties: distributor, director, gross, budget, title

Listing 7.1 Example of a configuration file for the code generation module.

The code generation module is a Java application based on the CodeModel library (https://codemodel.java.net/). It relies on an XML configuration file to specify the DBpedia classes and properties to consider in the generated Java classes. Listing 7.1 shows an example in which we managed movies and the people who participate in them, such as actors and directors, and their related properties. The client application exploits the generated Java files, which correspond to the model layer. Besides, some XML configuration files are necessary for internal purposes. The process is summarized in Figure 7.3.
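A minimal sketch of such a generation step is given below, assuming the CodeModel API and using the properties of Listing 7.1; the package name, getter layout, and output folder are illustrative choices, not the framework’s actual ones.

import com.sun.codemodel.JCodeModel;
import com.sun.codemodel.JDefinedClass;
import com.sun.codemodel.JFieldVar;
import com.sun.codemodel.JMethod;
import com.sun.codemodel.JMod;

import java.io.File;

public class ModelGenerator {
    public static void main(String[] args) throws Exception {
        // Properties read from the XML configuration (cf. Listing 7.1).
        String[] properties = {"distributor", "director", "gross", "budget", "title"};

        JCodeModel codeModel = new JCodeModel();
        // Illustrative package and class name for the configured movie class.
        JDefinedClass movie = codeModel._class("org.example.model.Movie");

        for (String property : properties) {
            // One private String field per configured DBpedia property...
            JFieldVar field = movie.field(JMod.PRIVATE, String.class, property);
            // ...plus a getter used by the controllers and views.
            JMethod getter = movie.method(JMod.PUBLIC, String.class,
                    "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1));
            getter.body()._return(field);
        }

        // Writes Movie.java into the folder consumed by the Android client.
        File outputDir = new File("generated-src");
        outputDir.mkdirs();
        codeModel.build(outputDir);
    }
}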

Fig. 7.3 A summary of the code generation and the configuration of our framework.

User Interface

The mobile application is made up of different screens, which were designed in collaboration with psychologists specialized in user interfaces. Firstly, on the home page, the user can insert the resource to be visualized. Then a screen with three tabs is shown: (i) a description, which also allows the user to start browsing the categories of the resource, as illustrated in Figure 7.4a; (ii) a paginated graph view of the resource, as depicted in Figure 7.4b; and (iii) a tabular view with the main information about the resource (Figure 7.4c). The graph view allows users to visualize the relationships of the currently shown resource with other resources, or just to present the values of certain properties. The number of edges displayed is limited to guarantee readability, since there are typically many properties for any resource. The user can scroll up or down to see other properties, and can thus explore the whole graph by focusing on a portion of it at a time.

(a) Description. (b) Graph view. (c) Main information.

Fig. 7.4 Visualizing a resource. Description, graph view and table to summarize the main information related to the resource dbpedia:The_Matrix.

(a) Categories of dbpedia:The_Matrix. (b) Resources in Post-apocalyptic_films.

Fig. 7.5 Browsing the categories of a resource. The categories of dbpedia:The_Matrix and the resources included in the category Post-apocalyptic_films. Only the properties specified in the configuration file are displayed when the framework is configured as a domain-based DBpedia browser. The same holds for the tab which presents the essential information about the resource: it is possible to specify which properties are meant to be shown; the others are excluded. In order to browse the categories of a resource, the user is presented with the list of all the categories. When a category is tapped, all the resources which belong to it are presented, and any of these resources may be visualized by tapping on it (in the same way as the initially given resource). Figure 7.5a and Figure 7.5b respectively depict the list of categories and the list of resources belonging to a category.

7.4 Conclusions and Future Work

In this chapter, we addressed the problem of Linked Data visualization, especially targeting lay-users. In fact, a visualization tool for mobile devices which was not limited to a single domain was still lacking. The work originated from the needs of Telecom Italia, which was interested in displaying recommendations based on Linked Data while being able to adapt to different domains. To address this problem, we proposed DBpedia Mobile Explorer, a Linked Data visualization framework for the mobile environment. It allows users to browse DBpedia resources by hiding the complexity of RDF and SPARQL, to support lay-users. Furthermore, although it is applied to DBpedia, the approach exploited is general because it is based only on RDF and SPARQL. Thus, it could also be adopted for other datasets or services in the Web of Data. The framework can either be configured as a general DBpedia browser or can be limited to a custom domain, by specifying the classes and properties to consider in the underlying vocabularies. The graph view allows the user to explore the whole graph by focusing on a portion of it at a time, since it presents a limited number of properties at once, to guarantee readability. Additionally, it is possible to browse the categories of a resource and access other resources starting from these categories. As future work, we aim to apply the approach to other datasets and to other category systems, such as YAGO categories. Then, we plan to extend browsing. Firstly, we want to exploit direct connections among resources by means of any property: e.g. passing from viewing information about a movie to visualizing details on one of the actors participating in it, by exploiting the dbpprop:starring property. Secondly, we aim to obtain super categories via hierarchical properties, such as skos:broader. Finally, we are extending the framework to support other RDF serializations and other mobile operating systems. Chapter 8

Use Cases of a Telecommunication Operator to Recommend Resources for Further Information

8.1 Introduction

The past several years have seen the Web’s evolution into a Semantic Web, with a continuous increase of information published as Linked Data. This increase has generated new opportunities for annotation and categorization systems to reuse these data as semantic knowledge bases, which can be interconnected and structured to increase annotation and categorization mechanisms’ precision and recall. TellMeFirst (http://tellmefirst.polito.it/), developed at the Nexa Center for Internet & Society, is a software tool for automatically classifying and enriching documents using semantics. Here, we briefly describe some technical details about TellMeFirst and then demonstrate how a major telecommunications operator in Italy has used this software in two practical industrial cases. Looking for ways to add value to its services, Telecom Italia introduced TellMeFirst functionalities into its Society and FriendTV applications. By describing these two use cases, we seek to provide a concrete example of how research and innovation can offer advantages at the business level when applied in real commercial services. In particular, by applying TellMeFirst to these two applications, we enabled the recommendation of resources from the Web of Data for providing the user with further information. The rest of this chapter is organized as follows: Section 8.2 provides an overview of text classification and annotation; Section 8.3 introduces TellMeFirst; Section 8.4 describes Society and FriendTV. We conclude in Section 8.5.

8.2 Text Classification and Annotation

Text classification is the assignment of a text to one or more preexisting classes (also known as features). This process determines a text document’s class membership given a set of distinct classes with a profile and various features [133]. The criterion for selecting relevant features for classification is essential and is determined a priori by the classifier (human or software). Semantic classification occurs when the classification’s target elements refer to the document’s meaning. Text annotation refers to the common practice of adding information to the text itself through underlining, notes, comments, tags, or links. Text annotation is semantic when information about either the overall document’s meaning or the meaning of the individual elements that compose it is added to the document’s text [134]. This is done primarily using links that connect a word, expression, or phrase to an information resource on the Web or to an entity in a knowledge base [135].
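To make the notion of semantic annotation concrete, the sketch below defines a minimal, hypothetical data structure (not part of TellMeFirst) that links a span of text to an entity in a knowledge base.

public class SemanticAnnotation {

    final String surfaceForm;  // the word or expression found in the text
    final int beginOffset;     // character offset where the span starts
    final int endOffset;       // character offset where the span ends
    final String entityUri;    // the linked resource, e.g. a DBpedia URI

    SemanticAnnotation(String surfaceForm, int beginOffset, int endOffset, String entityUri) {
        this.surfaceForm = surfaceForm;
        this.beginOffset = beginOffset;
        this.endOffset = endOffset;
        this.entityUri = entityUri;
    }

    @Override
    public String toString() {
        return "\"" + surfaceForm + "\" [" + beginOffset + "," + endOffset + ") -> " + entityUri;
    }

    public static void main(String[] args) {
        // "Neo fights in the Matrix" annotated with a link to the film's DBpedia entity.
        String text = "Neo fights in the Matrix";
        SemanticAnnotation a = new SemanticAnnotation(
                "Matrix", text.indexOf("Matrix"), text.indexOf("Matrix") + "Matrix".length(),
                "http://dbpedia.org/resource/The_Matrix");
        System.out.println(a);
    }
}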

8.3 TellMeFirst

The TellMeFirst project began in October 2011 at the Nexa Center for Internet & Society in Politecnico di Torino’s Department of Control and Computer Engineering. The project received funding from the Working Capital-National Innovation Award. It’s available under the GNU AGPLv3 license on GitHub (http://github.com/TellMeFirst). TellMeFirst automatically classifies and enriches documents using DBpedia (http://dbpedia.org/) as the reference knowledge base for content extraction and disambiguation. Similar software tools include DBpedia Spotlight [88] and Apache Stanbol (https://stanbol.apache.org/). We chose

DBpedia for semantic classification because the Wikipedia corpus is a perfect training set for categorization approaches based on machine learning (wherein software agents learn from data [136]) and for semantic annotation because it’s directly connected to Wikipedia’s vast, multilingual, pre-annotated corpus [12]. As noted, TellMeFirst exploits the relationship between Wikipedia and DBpedia to perform semantic annotation and classification processes quickly and efficiently. Although this feature distinguishes it from similar tools, it also makes it dependent on these datasets. Given the Web of Data’s open nature, which isn’t limited to a single dataset, we must consider future evolution toward compatibility with multiple datasets.

8.3.1 Semantic Annotation

TellMeFirst’s semantic annotation process associates semantic information with the words contained in a text, that is, it identifies which meaning of a word a sentence uses. This problem is well known as word-sense disambiguation (WSD). To address it, TellMeFirst provides a disambiguator that implements three subcomponents: knowledge-based, corpus-based, and first-sense heuristic disambiguators. When a term isn’t disambiguated by the knowledge-based or the corpus-based disambiguator with a certain degree of confidence, the first-sense heuristic disambiguator assigns the most common meaning. To do this, it exploits Wikipedia resources’ coefficient of prominence, i.e. the number of times each word contained in the text to be annotated is mentioned in Wikipedia through a wikilink (an internal link within Wikipedia). The heuristic approach often performs only a few percentage points below full WSD systems [137]. We carried out tests of TellMeFirst’s disambiguators on a corpus of 10 newspaper excerpts. Table 8.1 summarizes the results; the last column shows the different disambiguators’ possible usage scenarios.
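The implementations of the three disambiguators are not shown in this thesis; the sketch below only illustrates the described cascade, with a hypothetical interface, falling back to the first-sense heuristic based on the wikilink prominence of the candidate senses.

import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class DisambiguationCascade {

    // Hypothetical contract for the knowledge-based and corpus-based disambiguators:
    // a sense (a DBpedia URI) is returned only when selected with enough confidence.
    interface Disambiguator {
        Optional<String> disambiguate(String term, String context);
    }

    // Resolves a term by trying the two confident disambiguators and falling back to the
    // first-sense heuristic: the candidate with the highest coefficient of prominence,
    // i.e. the number of wikilinks pointing to it in Wikipedia.
    static String resolve(String term, String context,
                          Disambiguator knowledgeBased, Disambiguator corpusBased,
                          List<String> candidateSenses, Map<String, Integer> wikilinkCounts) {
        return knowledgeBased.disambiguate(term, context)
                .or(() -> corpusBased.disambiguate(term, context))
                .orElseGet(() -> candidateSenses.stream()
                        .max(Comparator.comparingInt(s -> wikilinkCounts.getOrDefault(s, 0)))
                        .orElse(null));
    }
}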

8.3.2 Semantic Classification

TellMeFirst implements a memory-based learning approach to semantic classification. This approach is a subcategory of the lazy learning family [138] as regards the classification phase (consultation time) and the calculation of the similarity with the training set.

Disambiguator          | Avg. time per word (s) | Avg. precision | Avg. recall | Canonical use case
Corpus-based           | 0.04                   | 0.85           | 0.21        | Online annotation of news portals or blogs
Knowledge-based        | 0.07                   | 0.99           | 0.05        | Automatic classification of documents based on DBpedia
First-sense heuristic  | 0.04                   | 0.78           | 0.24        | Online annotation of news portals or blogs in a more generic setting, where the most common Wikipedia meaning is the most likely
Default                | 0.10                   | 0.96           | 0.08        | Offline annotation, automatic classification, text enhancement
Table 8.1 Tests on TellMeFirst’s disambiguators.
A distinctive feature of the memory-based approach, also known as instance-based learning, is that the system does not create an abstract model of the classification categories (profiles) prior to text categorization. Instead, it assigns the target document to a class on the basis of a local comparison between the pre-classified documents and the target [139, 140]. The classifier must hold in memory all instances of the training set and calculate, during the classification stage, the distance vector between the training documents and the unclassified ones. An alternative approach, eager learning, conducts this operation in a learning phase (training time) in which specific category profiles are created and the function to perform the classification is defined [141]. TellMeFirst’s semantic classification process is performed using the k-nearest neighbor (kNN) algorithm. This algorithm is a memory-based approach that chooses the categories to which the target document belongs based on the k most similar documents in a vector space [141]. The training set consists of all paragraphs in which a wikilink exists. These paragraphs are stored in an Apache Lucene (http://lucene.apache.org/) index: each DBpedia resource (correlated with a Wikipedia page) corresponds to a Lucene document, and for each document, there is a CONTEXT field for every paragraph in which the resource appears as a wikilink. During the classification (following a lazy approach), the target document is transformed into a Boolean Lucene query over the index’s CONTEXT field to discover the conceptual similarity with the contexts of Wikipedia entries. To calculate the similarity, TellMeFirst uses Lucene’s default similarity, which combines the Boolean model with the vector space model (VSM). The results accepted by the Boolean search on the index are then sorted according to the VSM. Lucene takes care of the stemming, lemmatization, and filtering (through specific Italian or English stop words) of the features for both the training documents and the target document transformed into a query. The query and the training documents become feature vectors (according to the bag-of-words model), in which each feature’s weight is calculated according to the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm. The query returns a list of documents (DBpedia URIs) ordered according to a similarity score that’s based on the cosine similarity. This well-known metric is robust for scoring the similarity between two textual strings and is frequently used in complex queries [142]. Once TellMeFirst’s similarity process obtains the ordered list of results, it applies the RCut method for thresholding [143, 144], keeping only the top seven results and discarding the others. For classification, TellMeFirst uses a technique based on the VSM that represents both the training documents and the target document.
We can view the similarity between two documents geometrically as the distance between the two vectors that represent the documents in an n-dimensional vector space, where n is the number of features in the entire training corpus. The VSM is also the basis of the Lucene libraries. Lucene is optimized to quickly calculate the distance between the documents according to the TF-IDF algorithm: given a query that represents the target document’s features, it returns a list of similar documents that are indexed, even when the index contains millions of documents. The score obtained with Lucene represents the inverse of the distance between two documents: the higher the score, the closer the documents are in the vector space. To show the results, TellMeFirst provides a visualizer containing a window with seven frames varying in size (according to the first seven previously ranked results). Each frame indicates a topic extracted from the text, and its size represents its relevance to the text. Figure 8.1 shows an example of the results displayed in the visualizer’s seven frames; other examples are available in our demo at http://tellmefirst.polito.it.
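Returning to the ranking step: in the actual system Lucene computes these scores; the following library-free sketch only illustrates the underlying idea of ranking candidates by cosine similarity between TF-IDF vectors and applying the RCut threshold of seven (the data structures are assumptions made for the example).

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class CosineRCut {

    // Cosine similarity between two TF-IDF vectors represented as term -> weight maps.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double w : b.values()) {
            normB += w * w;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Ranks candidate DBpedia resources by similarity with the target document and applies
    // the RCut threshold, keeping only the top k results (k = 7 in TellMeFirst).
    static List<String> rankAndCut(Map<String, Double> targetVector,
                                   Map<String, Map<String, Double>> candidateVectors, int k) {
        List<String> uris = new ArrayList<>(candidateVectors.keySet());
        uris.sort(Comparator.comparingDouble(
                (String uri) -> cosine(targetVector, candidateVectors.get(uri))).reversed());
        return uris.subList(0, Math.min(k, uris.size()));
    }
}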

Fig. 8.1 An example of the results displayed by the TellMeFirst visualizer. The seven frames shown in this figure represent the topics extracted from a text. The dimension of each frame indicates the relevance of each represented topic.

8.3.3 Components of TellMeFirst

TellMeFirst is made up of the following modules; a sketch of how they could be chained is given after the list:

Document parser It is used to extract the textual information from documents in diverse formats, such as PDF, Word, and HTML. It was built using libraries such as Apache PDFBox, Apache POI, and Snacktory.

Part-Of-Speech tagger and lemmatizer It is a wrapper that uses external services for the lemmatization of adjectives and common nouns. This module reduces errors caused by homographs of adjectives and common nouns.

Data extractor This module extracts and processes the data obtained from DBpedia and Wikipedia. It can work offline as a preprocessor.

Dictionary-based spotter This module deals with the extraction of terms from the text to be annotated. It uses an internal dictionary created with the information obtained by the data extractor and extracts the terms that match the entries of the dictionary.

Disambiguator This module is in charge of the disambiguation of the text.


Classifier It uses a number of datasets in the Web of Data to establish the topics of the processed text. It performs SPARQL queries on DBpedia, starting from a list of DBpedia concepts (generated by the disambiguator), to find the entities with the largest number of links (object properties) to the concepts extracted from the input document.

Enhancer This component finds new information and new content to be added to the document to be enriched (the enhanced document), starting from the list of DBpedia entities that were identified as the text’s topics.

Visualizer This is the display module of the system, and it is responsible for collecting in a single interface the new information with which the input document has been enhanced. The results are presented in a window with seven frames of different sizes. Each frame indicates a topic extracted from the text, and its size represents its relevance according to the input document.
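The thesis describes these modules only in prose; the sketch below shows, with invented single-method interfaces, how such a pipeline could be chained from parsing to enhancement.

import java.util.List;

public class TellMeFirstPipeline {

    // Invented single-method views of the modules listed above.
    interface DocumentParser { String extractText(byte[] document); }
    interface Spotter { List<String> spot(String text); }
    interface Disambiguator { List<String> disambiguate(List<String> terms, String text); }
    interface Classifier { List<String> classify(List<String> entityUris); }
    interface Enhancer { List<String> enhance(List<String> topicUris); }

    // Chains the modules: parse, spot, disambiguate, classify, enhance.
    static List<String> process(byte[] document, DocumentParser parser, Spotter spotter,
                                Disambiguator disambiguator, Classifier classifier,
                                Enhancer enhancer) {
        String text = parser.extractText(document);
        List<String> terms = spotter.spot(text);
        List<String> entities = disambiguator.disambiguate(terms, text);
        List<String> topics = classifier.classify(entities);
        return enhancer.enhance(topics);
    }
}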

8.4 TellMeFirst in Practice

Telecom Italia has implemented TellMeFirst to enhance two services: Society and FriendTV.

8.4.1 Society

Society is a Telecom Italia platform that lets users in a social community share notes and comments while reading an e-book. TellMeFirst enables this service to analyze the content, notes, and comments to extract semantic concepts, and hence lets readers explore in more depth the information an e-book contains. Society is composed of a reader community that can share comments about a paragraph or even contribute to improving an e-book by sending correction reports to authors. Groups of readers can form based on social networking relationships. Users can share comments as notes in the social network, thus propagating them to other users based on a user’s sharing configuration (settings). Users can share each note through a specific interface to the most-used social networks, such as Facebook or Twitter. Through this interface, friends or followers can see what other users did, read their notes and add comments, or retweet the note to give more visibility to this piece of information.

Fig. 8.2 The graphic interface of the Society application for Android devices. A particular interface from the social network to the system node platform can help users extract and enrich notes with information from other users on the same social network platform (Figure 8.2). Integration with TellMeFirst is an important feature because it lets the Society application semantically annotate user-generated notes. When a new note or comment is created on Society, TellMeFirst analyzes it to recognize each relevant entity, such as places, names, and concepts, and links them with concepts or resources in the Web of Data. The main results can then be returned, and the application can show them via the user interface, letting users save these results as a note that adds extra information to the book. Additionally, once the relevant entities are linked to resources in the Web of Data, it is possible to apply Linked Data based algorithms (such as the one described in Chapter 5) to suggest additional related information. The Society application can be used on-the-go because it’s aware of the user’s context (e.g. by detecting that an entity is a place). In this case, the semantic source can also provide location information and be used as an extra field for searching more content, such as multimedia user-generated content that matches the same location. Society also offers a traditional search feature that uses the Web as a common source for multimedia and extra information, such as images, video, audio, and text related to words or sentences written in the book. At the same time, the note platform can provide its information to other applications to show notes in their target interface based on location information. Moreover, the same application provides some accessibility functions that extend the book-reading experience to those with impaired abilities, for instance, by reading the text through a text-to-speech (TTS) engine for blind users or adjusting font sizes for those with limited vision. This application also fits perfectly with education initiatives aimed at digitizing schools, allowing interactive education processes that avoid hard-printed books and enabling the exchange of instant messages between teachers and students. At the moment, Telecom Italia is exploiting Society mainly as a social initiative in these two areas. However, economic benefits are also possible through an e-book distribution that supports social comment exchange features. Society is available as a mobile application for Android (https://play.google.com/store/apps/details?id=it.telecomitalia.society) and iOS (https://itunes.apple.com/it/app/society-school-2.0/id785451519?mt=8) devices.

8.4.2 FriendTV

FriendTV is a social television service that lets users share TV experiences with other viewers on social media via tablets and smartphones. FriendTV presents a list of TV programs that might be of interest to a user. It uses a semantic annotator and classifier from TellMeFirst to extract and associate the concepts (based on their semantic meaning) contained in each program’s description with related existing Web resources. Hence, users can easily browse for additional information about related concepts. We can view this service as a TV guide that’s integrated with Twitter and Facebook to let users discuss the most-followed TV programs on social media sites and also receive related suggestions. With FriendTV, a user can obtain information about scheduled TV programs, communicate to other users what he or she is watching, and set broadcast notifications for programs of interest. Users can also rate TV programs, letting the system provide better recommendations. Furthermore, the service lets broadcasters and media agencies release questionnaires, compute statistics based on social media, and insert banners with program information or advertisements.

Fig. 8.3 Using TellMeFirst in FriendTV. (1) The user selects a program. (2) The raw text composing the description of the selected program is given as input to TellMeFirst, which (3) generates the annotated version of the same text, in which several entities are linked to existing and related Web resources. (4) The user receives these resources as related content.

TellMeFirst is integrated into this service to provide users with content related to a program. In fact, users can open a detailed view of a particular program and receive related content, such as videos. Thus, by starting from the TV program description, TellMeFirst can annotate the text, classify it, and exploit the links with other resources generated by the annotation to retrieve semantically related content. Figure 8.3 depicts the FriendTV service’s workflow. More information about FriendTV is available on the Web (http://www.stv.telecomitalia.it/). Additionally, FriendTV is available as a mobile application for Android (https://play.google.com/store/apps/details?id=it.telecomitalia.friendtv) and iOS (https://itunes.apple.com/it/app/friendtv/id784514746) devices. It currently has thousands of downloads.

8.5 Conclusions and Future Work

The Web is facing a crucial challenge in promoting the construction of a new knowledge infrastructure. This is a fundamental task of Linked Data in achieving the Semantic Web vision. The TellMeFirst software platform takes advantage of the information present in the Web of Data to generate semantic annotations and concept classifications. To give users of this platform maximum benefit, we had to establish a joint collaboration with mobile provider Telecom Italia and with mobile service users in direct contact with the Web of Data. When users employ these services in their real lives, both TellMeFirst’s creators and telecommunications operators benefit. In fact, through the increased use of semantic annotation and classification functionalities, we’ve detected areas of improvement. In particular, these functionalities enable the recommendation of additional information from the Web of Data. Moreover, with these new features, operators can find new ways to monetize these services and make them increasingly innovative. As future work, we plan to improve TellMeFirst’s functionalities by introducing a Linked Data based concept recommender that can suggest concepts similar to those originally extracted in the initial semantic annotation process. This improvement will enable a whole new scenario of multidomain recommendations. Additionally, as we mentioned, efforts are under way to adapt TellMeFirst to work with knowledge bases beyond DBpedia. Chapter 9

Conclusions and Perspectives

9.1 Summary of Contributions

The work presented in this thesis aimed to discover useful information for the user from the huge amount of structured data, and notably Linked Data, available on the Web. In particular, three main goals were defined: (i) exploiting existing relationships between resources published on the Web to provide recommendations, (ii) representing users and their context to provide better recommendations, and (iii) effectively visualizing the recommended resources and their relationships. This thesis showed how it is possible to exploit Linked Data available on the Web to recommend useful resources to users. Linked Data have been successfully applied to recommender systems to provide cross-domain and novel recommendations about music, movies, and tourist attractions, and to address well-known problems such as data sparsity and context representation. Various proposed approaches have been applied to use cases of Telecom Italia in a mobile scenario. More specifically, the main contributions of this thesis are:

• A systematic literature review summarizing the state of the art in Linked Data based RS and suggesting further research directions, which was introduced in Chapter 3.

• The participation in the design and development of the Allied framework, presented in Chapter 4.

• ReDyAl, a new algorithm which iteratively exploits relationships among resources in the Web of Data, and its evaluation, described in Chapter 5.

• The RSCtx ontology for representing the context of a user to be provided with recommendations, its application in a new context-aware pre-filtering approach for recommender systems, and its evaluation (Chapter 6).

• DBpedia Mobile Explorer, a new visualization framework for DBpedia targeting mobile devices, which has been used to display recommendations based on Linked Data, as shown in Chapter 7.

• The application of the previously mentioned contributions to the use cases of Telecom Italia (outlined in Chapter 5 and Chapter 8), in order to suggest tourist attractions, movies, and additional information about books and TV programs to users in a mobile context.

The systematic literature review focused on the research problems addressed and on the contributions proposed in the area of Linked Data based RS. It classified Linked Data based RS into categories according to the use of Linked Data and summarized the application domains targeted and the evaluation techniques used. The work described in this thesis is based on the outcome of this review. Allied is a framework for the deployment and execution of Linked Data based recommendation algorithms. It also facilitates studies which evaluate them in different application domains, without being bound to a single dataset. Thus, the framework makes it possible to benchmark the algorithms and choose the one that best fits the recommendation requirements. In the current implementation, it relies on DBpedia, but it is equally suitable for other datasets. Additionally, Allied is designed to be used as the main component for recommendations in a given architecture. In this way, developers do not need to deal with the execution platform of the algorithms, but only to focus their efforts either on selecting an existing algorithm or on writing a customized one. Although the author contributed to the development of Allied, the framework is beyond the scope of this thesis. Nonetheless, ReDyAl was integrated into the framework, and Allied was used to evaluate ReDyAl. We plan to extend the framework by incorporating the contextual pre-filtering approach, although it can only be used with collaborative filtering techniques since it relies on user preferences. Thus, it does not apply to ReDyAl and the other Linked Data based algorithms implemented within Allied. To provide some algorithms to be used with our approach, the LibRec library (http://www.librec.net/) can be integrated. ReDyAl is a new algorithm which relies on Linked Data by exploiting existing relationships between resources to recommend related resources. It iteratively analyzes the categories they belong to and their explicit references to other resources, and then combines the results. The algorithm was tested with DBpedia, but it could be adapted to other datasets on the Web of Data with minor adjustments. It is not bound to any particular application domain, but can be calibrated for a given domain in order to obtain more specific results. A user study comparatively evaluated its accuracy and novelty against three state-of-the-art algorithms and showed that it provides a higher number of new recommendations while keeping a satisfying prediction accuracy. The RSCtx OWL ontology describes the user’s context for RS. In the philosophy of Linked Data, it reuses terms from third-party ontologies and can be extended. It models contextual information as the sum of various dimensions on different levels of granularity, may be reused in multiple domains and applications, and complies with the most common context definitions. The ontology is published on the Web according to Linked Data principles. It was used in a new contextual pre-filtering approach which can be combined with existing recommendation algorithms. An offline study evaluated the proposed approach with a rating prediction task and showed that the use of the proposed ontology and our pre-filtering technique with some well-known recommendation algorithms significantly improved the prediction accuracy. DBpedia Mobile Explorer is a Linked Data visualization framework for the mobile environment, which allows users to explore DBpedia by hiding the complexity of RDF and SPARQL.
It can be configured as a generic DBpedia browser, thus enabling the visualization of the whole dataset, or it can focus on a limited number of resources and properties, thus generating a browser customized for a particular domain. Our framework was designed to work with DBpedia, but can be adapted to other datasets and also to other services in the Web of Data, since it relies only on RDF and SPARQL. It was used to display recommendations based on Linked Data, and it originated from the needs of Telecom Italia.

Various contributions were applied to some use cases of Telecom Italia. The first version of ReDyAl was employed in an eTourism platform to suggest attractions and POIs to tourists, while a mobile application to recommend movies exploited a second and improved version of the algorithm. This application also utilized the DBpedia Mobile Explorer framework to visualize the recommended resources. Additionally, semantic text classification and annotation techniques were used in Society and FriendTV through TellMeFirst to suggest additional information to users about the content of books and TV programs.

9.2 Limitations

Both ReDyAl and the contextual pre-filtering method based on RSCtx are designed to be independent of the application domain, although they can be calibrated to be used in a given domain. However, only one domain was evaluated for each, i.e. movies for ReDyAl and music for the contextual pre-filtering method. In the case of ReDyAl, we focused on movies because Telecom Italia was interested in exploiting it in a mobile application to suggest movies, while for the contextual pre-filtering we considered concerts because of the dataset used for the evaluation. Additionally, in these domains, participants in the studies were not required to have specific skills, and a large amount of data was available. Other studies should be conducted in additional domains. Also, ReDyAl could be applied to other datasets in the Web of Data, although in this work it was used with DBpedia only. Further research should evaluate the recommendations generated using other datasets. Our contextual pre-filtering technique was only tested considering time and location as context because we could not find evaluation datasets with additional dimensions. We should study the impact of the other contextual dimensions designed in RSCtx and, in general, we should investigate which context dimensions could be useful for recommendation scenarios, although this may strictly depend on the application domain. Finally, ReDyAl could also be extended to consider more than one resource as input (e.g. all the resources rated positively by a user). In order to do this, ReDyAl could be executed multiple times to generate recommendations given a number of initial resources, and subsequently the results could be merged. However, this would significantly increase the response time, since the algorithm relies on SPARQL queries to discover candidate recommendations through the links among resources, which is computationally expensive. Thus, we should study how to do this taking performance into account. Another resource to consider could be the current context of the user. We should investigate how to combine ReDyAl with our pre-filtering approach: for example, the latter could select an initial resource for ReDyAl from a set of user ratings based on the context.

9.3 Publications

The systematic literature review has been published in Concurrency and Computation: Practice and Experience [145]. Allied has been accepted for publication in International Journal on Semantic Web and Information Systems [146]. A preliminary version of the ReDyAl algorithm has been presented at the Seventh Conference on Internet of Things and Smart Spaces (ruSMART) [147], together with the use case in the tourism domain. The current version was introduced at the Third Workshop on New Trends in Content-Based Recommender Systems (CBRecSys) co-located with the Tenth ACM Conference on Recommender Systems (RecSys) in 2016 [148]. The RSCtx ontology and the related contextual pre-filtering method for CARS were presented at the Federated Conference on Computer Science and Information Systems (FedCSIS) in 2016 [149]. The DBpedia Mobile Explorer framework has been presented at the first IEEE International Forum on Research and Technologies for Society and Industry [150]. The application of semantic annotation and classification to the use cases of Telecom Italia has been published in IT Professional [151].

9.4 Perspectives

The Web is leaving the era of search and entering one of discovery. Search is looking for something. Discovery is finding something that we did not know existed, or did not know how to ask for (http://archive.fortune.com/magazines/fortune/fortune_archive/2006/11/27/8394347/index.htm). The Web is more and more driven by recommender systems, and Linked Data is a promising trend in this area. It allows recommender systems to enrich item descriptions and user profiles for different domains and can mitigate the effects of well-known problems, such as the new user, new item, and data sparsity problems. However, using such a vast amount of interlinked data poses new challenges for well-established recommendation algorithms. This thesis outlines some enhancements achieved by exploiting the implicit knowledge represented in the Web of Data and the benefits for the users when adopting these enhancements in application scenarios. Nonetheless, further improvement is still possible. For example, discovering latent relationships among items and users could enable diversified recommendations. Diversity is a popular topic in content-based recommender systems, which usually suffer from overspecialization. Another issue which is gaining interest is mining microblogging data and text reviews. In particular, opinion mining and sentiment analysis techniques can support recommendation methods that take into account the evaluation of aspects of items expressed in text reviews. Extracting information from raw text in the form of Linked Data can ease its exploitation and integration. Additionally, Linked Data could also be used to explain recommendations, since it encodes semantic information. This could be particularly useful when unknown items are proposed: the system should assist the user in the decision process, both to justify the suggestion and to provide additional information that allows the user to understand the quality of the recommended item. This could increase the transparency and scrutability of the system, and the user’s trust and satisfaction. In this thesis, we showed that Linked Data based RS generate new recommendations. This is useful because users do not want to receive recommendations about items they already know about or have previously consumed. Additionally, recommending very popular items, which can be easily discovered, may not be enough. For this reason, it is important to propose items that are interesting and unexpected. This is known as serendipity and should be further investigated. Finally, a closely related research area is exploratory search. It refers to cognitively demanding search tasks such as learning or topic investigation. Exploratory search systems also recommend relevant topics or concepts. An open question not addressed in this thesis is how to leverage the richness of data semantics for exploratory search. References

[1] Francesco Ricci, Lior Rokach, and Bracha Shapira. Introduction to Recommender Systems Handbook. In Recommender Systems Handbook, pages 1–35. Springer, 2011.
[2] Sarabjot Singh Anand, Patricia Kearney, and Mary Shapcott. Generating semantically enriched user profiles for web personalization. ACM Trans. Internet Technol., 7(4), October 2007.
[3] Iván Cantador, Alejandro Bellogín, and Pablo Castells. A multilayer ontology-based hybrid recommendation model. AI Commun., 21(2-3):203–210, April 2008.
[4] Marco Degemmis, Pasquale Lops, and Giovanni Semeraro. A content-collaborative recommender that exploits wordnet-based user profiles for neighborhood formation. User Modeling and User-Adapted Interaction, 17(3):217–255, 2007.
[5] Stuart E. Middleton, Nigel R. Shadbolt, and David C. De Roure. Ontological user profiling in recommender systems. ACM Trans. Inf. Syst., 22(1):54–88, January 2004.
[6] Bamshad Mobasher, Xin Jin, and Yanzan Zhou. Web Mining: From Web to Semantic Web: First European Web Mining Forum, EWMF 2003, Invited and Selected Revised Papers, chapter Semantically Enhanced Collaborative Filtering on the Web, pages 57–76. Springer Berlin Heidelberg, Berlin, Heidelberg, 2004.
[7] Giovanni Semeraro, Marco Degemmis, Pasquale Lops, and Pierpaolo Basile. Combining learning and word sense disambiguation for intelligent user profiling. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI’07, pages 2856–2861, San Francisco, CA, USA, 2007. Morgan Kaufmann Publishers Inc.
[8] Giovanni Semeraro, Pasquale Lops, Pierpaolo Basile, and Marco de Gemmis. Knowledge infusion into content-based recommender systems. In Proceedings of the Third ACM Conference on Recommender Systems, RecSys ’09, pages 301–304, New York, NY, USA, 2009. ACM.

[9] Cai-Nicolas Ziegler, Georg Lausen, and Lars Schmidt-Thieme. Taxonomy-driven computation of product recommendations. In Proc. of the Thirteenth ACM Int. Conf. on Information and Knowledge Management, CIKM ’04, pages 406–415, New York, NY, USA, 2004. ACM.
[10] Tommaso Di Noia and Vito Claudio Ostuni. Reasoning Web. Web Logic Rules: 11th Int. Summer School 2015, Berlin, Germany, July 31-August 4, 2015, Tutorial Lectures, chapter Recommender Systems and Linked Open Data, pages 88–113. Springer International Publishing, Cham, 2015.
[11] David Wood, Markus Lanthaler, and Richard Cyganiak. RDF 1.1 concepts and abstract syntax. W3C recommendation, W3C, February 2014. http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/.
[12] Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, and Christian Bizer. DBpedia - a large-scale, multilingual knowledge base extracted from wikipedia. Semantic Web Journal, 6(2):167–195, 2014.
[13] Fedelucio Narducci, Matteo Palmonari, and Giovanni Semeraro. Cross-Language Semantic Retrieval and Linking of E-Gov Services, pages 130–145. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.
[14] Gediminas Adomavicius and Alexander Tuzhilin. Recommender Systems Handbook, chapter Context-Aware Recommender Systems, pages 217–253. Springer US, Boston, MA, 2011.
[15] Steven Harris and Andy Seaborne. SPARQL 1.1 query language. W3C recommendation, W3C, March 2013. http://www.w3.org/TR/2013/REC-sparql11-query-20130321/.
[16] Gediminas Adomavicius and Alexander Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, 17(6):734–749, 2005.
[17] David Goldberg, David Nichols, Brian M. Oki, and Douglas Terry. Using collaborative filtering to weave an information tapestry. Commun. ACM, 35(12):61–70, December 1992.
[18] Will Hill, Larry Stead, Mark Rosenstein, and George Furnas. Recommending and evaluating choices in a virtual community of use. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’95, pages 194–201, New York, NY, USA, 1995. ACM Press/Addison-Wesley Publishing Co.
[19] Paul Resnick, Neophytos Iacovou, Mitesh Suchak, Peter Bergstrom, and John Riedl. Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, CSCW ’94, pages 175–186, New York, NY, USA, 1994. ACM.

[20] Upendra Shardanand and Pattie Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI 1995), pages 210–217. ACM Press, 1995.
[21] Paolo Cremonesi, Yehuda Koren, and Roberto Turrin. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys ’10, pages 39–46, New York, NY, USA, 2010. ACM.
[22] Guy Shani and Asela Gunawardana. Evaluating recommendation systems. Recommender Systems Handbook, pages 257–297, 2011.
[23] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. Bpr: Bayesian personalized ranking from implicit feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, UAI ’09, pages 452–461, Arlington, Virginia, United States, 2009. AUAI Press.
[24] Robin Burke. Hybrid Web Recommender Systems, pages 377–408. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.
[25] Pasquale Lops, Marco De Gemmis, and Giovanni Semeraro. Content-based recommender systems: State of the art and trends. In Recommender systems handbook, pages 73–105. Springer, 2011.
[26] Ricardo A. Baeza-Yates and Berthier A. Ribeiro-Neto. Modern Information Retrieval - the concepts and technology behind search, Second edition. Pearson Education Ltd., Harlow, England, 2011.
[27] Alexander Felfernig, Michael Jeran, Gerald Ninaus, Florian Reinfrank, and Stefan Reiterer. Toward the Next Generation of Recommender Systems: Applications and Research Challenges. In Multimedia Services in Intelligent Environments, chapter Smart Inno, pages 81–98. Springer, 2013.
[28] John S. Breese, David Heckerman, and Carl Kadie. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, UAI’98, pages 43–52, San Francisco, CA, USA, 1998. Morgan Kaufmann Publishers Inc.
[29] Yehuda Koren. Factorization meets the neighborhood: A multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08, pages 426–434, New York, NY, USA, 2008. ACM.
[30] Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. Computer, 42(8):30–37, August 2009.

[31] Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, and David M. Pennock. Methods and metrics for cold-start recommendations. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '02, pages 253–260, New York, NY, USA, 2002. ACM.
[32] Xia Ning, Christian Desrosiers, and George Karypis. A Comprehensive Survey of Neighborhood-Based Recommendation Methods, pages 37–76. Springer US, Boston, MA, 2015.
[33] Daniele Dell'Aglio, Irene Celino, and Dario Cerizza. Anatomy of a semantic web-enabled knowledge-based recommender system. In CEUR Workshop Proceedings, volume 667, pages 115–130, 2010.
[34] Robin Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction, 12(4):331–370, 2002.
[35] Tom Heath and Christian Bizer. Linked Data: Evolving the Web into a Global Data Space. Morgan & Claypool, 1st edition, 2011.
[36] Tim Berners-Lee, James Hendler, and Ora Lassila. The semantic web. Scientific American, 284(5):34–43, May 2001.
[37] Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked data - the story so far. Int. J. Semantic Web Inf. Syst., 5(3):1–22, 2009.
[38] Gavin Carothers and Eric Prud'hommeaux. RDF 1.1 Turtle. W3C recommendation, W3C, February 2014. http://www.w3.org/TR/2014/REC-turtle-20140225/.
[39] Fabien Gandon and Guus Schreiber. RDF 1.1 XML syntax. W3C recommendation, W3C, February 2014. http://www.w3.org/TR/2014/REC-rdf-syntax-grammar-20140225/.
[40] Dan Brickley and R.V. Guha. RDF Schema 1.1. W3C recommendation, W3C, February 2014. http://www.w3.org/TR/rdf-schema/.
[41] Barbara Kitchenham. Procedures for Performing Systematic Reviews. Technical report, Keele University, Keele, UK, 2004.
[42] Barbara Kitchenham and Stuart Charters. Guidelines for performing Systematic Literature Reviews in Software Engineering. Technical report, University of Durham, Durham, UK, 2007.
[43] Laurent Candillier, Kris Jack, Françoise Fessant, and Frank Meyer. State-of-the-art recommender systems. Collaborative and Social Information Retrieval and Access-Techniques for Improved User Modeling, pages 1–22, 2009.
[44] Jesus Bobadilla, Fernando Ortega, Antonio Hernando, and Abraham Gutiérrez. Recommender systems survey. Knowledge-Based Systems, 46(0):109–132, 2013.

[45] Nicolas Marie and Fabien Gandon. Survey of linked data based exploration systems. In Proceedings of the 3rd International Conference on Intelligent Exploration of Semantic Data - Volume 1279, IESD'14, pages 66–77, Aachen, Germany, 2014. CEUR-WS.org.
[46] Barbara Kitchenham and Pearl Brereton. A systematic review of systematic review process research in software engineering. Information and Software Technology, 55(12):2049–2075, 2013.
[47] Gary Marchionini. Exploratory search: From finding to understanding. Commun. ACM, 49(4):41–46, April 2006.
[48] Daniela S. Cruzes and Tore Dybå. Recommended steps for thematic synthesis in software engineering. In Proceedings of the 2011 International Symposium on Empirical Software Engineering and Measurement, ESEM '11, pages 275–284, Washington, DC, USA, 2011. IEEE Computer Society.
[49] Roberto Mirizzi and Tommaso Di Noia. From exploratory search to web search and back. In Proceedings of the 3rd Workshop on Ph.D. Students in Information and Knowledge Management, PIKM '10, pages 39–46, New York, NY, USA, 2010. ACM.
[50] Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 697–706, New York, NY, USA, 2007. ACM.
[51] Christiane Fellbaum. WordNet: An Electronic Lexical Database. Bradford Books, 1998.
[52] Michael Ley. The DBLP computer science bibliography: Evolution, research issues, perspectives. In SPIRE, volume 2476 of Lecture Notes in Computer Science, pages 1–10. Springer, 2002.
[53] Oktie Hassanzadeh and Mariano Consens. Linked movie data base. In Workshop on Linked Data on the Web (LDOW 2009), 2009.
[54] Aaron Swartz. MusicBrainz: a semantic web service. IEEE Intelligent Systems, 17(1):76–77, Jan 2002.
[55] Alexandre Passant and Yves Raimond. Combining social music and semantic web for music-related recommender systems. In CEUR Workshop Proceedings, volume 405, 2008.
[56] Yinuo Zhang, Hao Wu, Vikram Sorathia, and Viktor K. Prasanna. Event recommendation in social networks with linked data enablement. In Slimane Hammoudi, Leszek A. Maciaszek, José Cordeiro, and Jan L. G. Dietz, editors, ICEIS (2), pages 371–379. SciTePress, 2013.

[57] Marieke Guy, Mathieu d'Aquin, Stefan Dietze, Hendrik Drachsler, Eelco Herder, and Elisabetta Parodi. LinkedUp: Linking open data for education. Ariadne, 72(4), Mar 2014.
[58] Mitsopoulou Evangelia, Davide Taibi, Daniela Giordano, Stefan Dietze, Hong Qing Yu, Panagiotis Bamidis, Charalampos Bratsas, and Luke Woodham. Connecting medical educational resources to the linked data cloud: the mEducator, store and API. In Linked Learning 2011: 1st International Workshop on eLearning Approaches for the Linked Data Age, 8th Extended Semantic Web Conference (ESWC 2011), 2011.
[59] Claus Stadler, Jens Lehmann, Konrad Höffner, and Sören Auer. LinkedGeoData: A core for a web of spatial open data. Semantic Web Journal, 3(4):333–354, 2012.
[60] Guan-Shuo Mai, Yu-Hwang Wang, Yue-Joe Hsia, Sheng-Shan Lu, and Chau-Chin Lin. Linked Open Data of Ecology (LODE): A new approach for ecological data sharing. Taiwan Journal of Forest Science, 26(4):417–424, Dec 2011.
[61] Joeran Beel, Stefan Langer, Marcel Genzmehr, Bela Gipp, Corinna Breitinger, and Andreas Nürnberger. Research paper recommender system evaluation: a quantitative literature survey. In Proceedings of the International Workshop on Reproducibility and Replication in Recommender Systems Evaluation, pages 15–22. ACM, 2013.
[62] Stephan Baumann, Rafael Schirru, and Bernhard Streit. Towards a storytelling approach for novel artist recommendations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 6817 LNCS, pages 1–15, 2011.
[63] Ulrich Küster and Birgitta König-Ries. Measures for benchmarking semantic web service matchmaking correctness. In The Semantic Web: Research and Applications, pages 45–59. Springer, 2010.
[64] Marjorie McShane, Sergei Nirenburg, and Stephen Beale. NLP with reasoning and for reasoning. In Chu-ren Huang, Nicoletta Calzolari, Aldo Gangemi, Alessandro Lenci, Alessandro Oltramari, and Laurent Prevot, editors, Ontology and the Lexicon: A Natural Language Processing Perspective, pages 98–121. Cambridge University Press, Cambridge, 2010.
[65] Jacopo Urbani, Jason Maassen, Niels Drost, Frank Seinstra, and Henri Bal. Scalable RDF data compression with MapReduce. Concurrency and Computation: Practice and Experience, 25(1):24–39, 2013.
[66] Alexandre Passant. dbrec - Music Recommendations Using DBpedia. In The Semantic Web - ISWC 2010, pages 209–224. Springer Berlin Heidelberg, 2010.

[67] Xavier Amatriain, Josep M. Pujol, and Nuria Oliver. I like it... I like it not: Evaluating user ratings noise in recommender systems. In Proceedings of the 17th International Conference on User Modeling, Adaptation, and Personalization: Formerly UM and AH, UMAP '09, pages 247–258, Berlin, Heidelberg, 2009. Springer-Verlag.
[68] Marco Rossetti, Fabio Stella, and Markus Zanker. Contrasting offline and online results when evaluating recommendation algorithms. In Proceedings of the 10th ACM Conference on Recommender Systems, RecSys '16, pages 31–34, New York, NY, USA, 2016. ACM.
[69] Max Schmachtenberg, Christian Bizer, and Heiko Paulheim. Adoption of the linked data best practices in different topical domains. In The Semantic Web - ISWC 2014, volume 8796 of Lecture Notes in Computer Science, pages 245–260. Springer International Publishing, 2014.
[70] Alistair Miles and Sean Bechhofer. SKOS simple knowledge organization system reference. W3C recommendation, W3C, 2009.
[71] Danica Damljanovic, Milan Stankovic, and Philippe Laublet. Linked data-based concept recommendation: Comparison of different methods in open innovation scenario. In 9th Extended Semantic Web Conference (ESWC 2012), volume 7295 of Lecture Notes in Computer Science, pages 24–38, 2012.
[72] Milan Stankovic, Werner Breitfuss, and Philippe Laublet. Discovering Relevant Topics Using DBPedia: Providing Non-obvious Recommendations. In 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pages 219–222. IEEE, August 2011.
[73] Nuno Seco, Tony Veale, and Jer Hayes. An Intrinsic Information Content Metric for Semantic Similarity in WordNet. In European Conference on Artificial Intelligence, pages 1089–1090, 2004.
[74] Mohamed Ali Hadj Taieb, Mohamed Ben Aouicha, Mohamed Tmar, and Abdelmajid Ben Hamadou. New information content metric and nominalization relation for a new wordnet-based method to measure the semantic relatedness. In Cybernetic Intelligent Systems (CIS), 2011 IEEE 10th International Conference on, pages 51–58, Sept 2011.
[75] Yutaka Kabutoya, Robert Sumi, Tomoharu Iwata, Toshio Uchiyama, and Tadasu Uchiyama. A topic model for recommending movies via linked open data. In 2012 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, pages 625–630. IEEE, December 2012.
[76] Erik Mannens, Sam Coppens, Toon De Pessemier, Hendrik Dacquin, Davy Van Deursen, Robbie De Sutter, and Rik Van de Walle. Automatic news recommendations via aggregated profiling. Multimedia Tools and Applications, 63(2):407–425, 2013.

[77] Vito Claudio Ostuni, Tommaso Di Noia, Eugenio Di Sciascio, and Roberto Mirizzi. Top-N recommendations from implicit feedback leveraging linked open data. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys '13, pages 85–92. ACM Press, 2013.
[78] Ladislav Peska and Peter Vojtas. Enhancing Recommender System with Linked Open Data. In 10th International Conference on Flexible Query Answering Systems (FQAS 2013), pages 483–494, Granada, Spain, 2013. Springer Berlin / Heidelberg.
[79] Iván Cantador, Ioannis Konstas, and Joemon M. Jose. Categorising social tags to improve folksonomy-based recommendations. Web Semantics: Science, Services and Agents on the World Wide Web, 9(1):1–15, March 2011.
[80] Hussam Hamdan. Experiments with DBpedia, WordNet and SentiWordNet as resources for sentiment analysis in micro-blogging. In Seventh International Workshop on Semantic Evaluation (SemEval 2013) - Second Joint Conference on Lexical and Computational Semantics, volume 2, pages 455–459, Atlanta, Georgia, 2013.
[81] Koki Kitaya, Hung-Hsuan Huang, and Kyoji Kawagoe. Music Curator Recommendations Using Linked Data. In Second International Conference on the Innovative Computing Technology (INTECH 2012), pages 337–339. IEEE, September 2012.
[82] Houda Khrouf and Raphaël Troncy. Hybrid event recommendation using linked data and user diversity. In Proceedings of the 7th ACM Conference on Recommender Systems, RecSys '13, pages 185–192. ACM Press, 2013.
[83] Gong Cheng, Saisai Gong, and Yuzhong Qu. An Empirical Study of Vocabulary Relatedness and Its Application to Recommender Systems. In 10th International Conference on The Semantic Web - Volume Part I, pages 98–113. Springer, 2011.
[84] Victor de Graaff, Anne van de Venis, Maurice van Keulen, and Rolf A. de By. Generic knowledge-based analysis of social media for recommendations. In CBRecSys 2015: New trends on content-based recommender systems. CEUR-WS.org, Sept 2015.
[85] Cataldo Musto, Pierpaolo Basile, Marco de Gemmis, Pasquale Lops, Giovanni Semeraro, and Simone Rutigliano. Automatic selection of linked open data features in graph-based recommender systems. In CBRecSys 2015: New trends on content-based recommender systems, pages 10–13. CEUR-WS.org, Sept 2015.
[86] Joseph L. Fleiss et al. Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378–382, 1971.

[87] J. Richard Landis and Gary G. Koch. The measurement of observer agreement for categorical data. Biometrics, 33(1):159–174, 1977.
[88] Joachim Daiber, Max Jakob, Chris Hokamp, and Pablo N. Mendes. Improving efficiency and accuracy in multilingual entity extraction. In Proceedings of the 9th International Conference on Semantic Systems (I-Semantics), 2013.
[89] William E. Winkler. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In Proceedings of the Section on Survey Research, pages 354–359, 1990.
[90] Krzysztof Goczyla, Wojciech Waloszek, and Aleksander Waloszek. Contextualization of a DL knowledge base. In Proc. of the 2007 Int. Workshop on Description Logics (DL2007), 2007.
[91] Umberto Panniello, Alexander Tuzhilin, Michele Gorgoglione, Cosimo Palmisano, and Anto Pedone. Experimental comparison of pre- vs. post-filtering approaches in context-aware recommender systems. In Proceedings of the Third ACM Conference on Recommender Systems, RecSys '09, pages 265–268, New York, NY, USA, 2009. ACM.
[92] Claudio Bettini, Oliver Brdiczka, Karen Henricksen, Jadwiga Indulska, Daniela Nicklas, Anand Ranganathan, and Daniele Riboni. A survey of context modelling and reasoning techniques. Pervasive and Mobile Computing, 6(2):161–180, 2010. Context Modelling, Reasoning and Management.
[93] Cristiana Bolchini, Carlo A. Curino, Elisa Quintarelli, Fabio A. Schreiber, and Letizia Tanca. A data-oriented survey of context models. SIGMOD Rec., 36(4):19–26, December 2007.
[94] Reto Krummenacher and Thomas Strang. Ontology-based context modeling. In Workshop on Context-Aware Proactive Systems, 2007.
[95] Juan Ye, Lorcan Coyle, Simon Dobson, and Paddy Nixon. Ontology-based models in pervasive computing systems. Knowl. Eng. Rev., 22(4):315–347, December 2007.
[96] Luca Costabello. Context-Aware Access Control and Presentation for Linked Data, chapter A Declarative Model for Mobile Context, pages 21–32. 2013.
[97] Harry Chen, Tim Finin, and Anupam Joshi. Ontologies for Agents: Theory and Experiences, chapter The SOUPA Ontology for Pervasive Computing, pages 233–258. Birkhäuser Basel, Basel, 2005.
[98] Thomas Strang, Claudia Linnhoff-Popien, and Korbinian Frank. Distributed Applications and Interoperable Systems: 4th IFIP WG6.1 Int. Conf., DAIS 2003, Paris, France, November 17-21, 2003. Proc., chapter CoOL: A Context Ontology Language to Enable Contextual Interoperability, pages 236–247. Springer Berlin Heidelberg, Berlin, Heidelberg, 2003.

[99] Xiao Hang Wang, Da Qing Zhang, Tao Gu, and Hung Keng Pung. Ontology based context modeling and reasoning using OWL. In Proc. of the Second IEEE Annual Conf. on Pervasive Computing and Communications Workshops, PERCOMW '04, pages 18–, Washington, DC, USA, 2004. IEEE Computer Society.
[100] Davy Preuveneers, Jan Bergh, Dennis Wagelaar, Andy Georges, Peter Rigole, Tim Clerckx, Yolande Berbers, Karin Coninx, Viviane Jonckers, and Koen Bosschere. Ambient Intelligence: Second European Symposium, EUSAI 2004, Eindhoven, The Netherlands, November 8-11, 2004. Proc., chapter Towards an Extensible Context Ontology for Ambient Intelligence, pages 148–159. Springer Berlin Heidelberg, Berlin, Heidelberg, 2004.
[101] Panu Korpipää and Jani Mäntyjärvi. Modeling and Using Context: 4th Int. and Interdisciplinary Conf. CONTEXT 2003, Stanford, CA, USA, June 23–25, 2003, Proc., chapter An Ontology for Mobile Device Sensor-Based Context Awareness, pages 451–458. Springer Berlin Heidelberg, Berlin, Heidelberg, 2003.
[102] Ramón Hervás and José Bravo. Towards the ubiquitous visualization: Adaptive user-interfaces based on the semantic web. Interacting with Computers, 23(1):40–56, 2011.
[103] Gregory D. Abowd, Anind K. Dey, Peter J. Brown, Nigel Davies, Mark Smith, and Pete Steggles. Towards a better understanding of context and context-awareness. In Proc. of the 1st Int. Symposium on Handheld and Ubiquitous Computing, HUC '99, pages 304–307, London, UK, 1999. Springer-Verlag.
[104] Marius Kaminskas and Francesco Ricci. Contextual music information retrieval and recommendation: State of the art and challenges. Computer Science Review, 6(2–3):89–119, 2012.
[105] Rui Cai, Chao Zhang, Chong Wang, Lei Zhang, and Wei-Ying Ma. MusicSense: Contextual music recommendation using emotional allocation modeling. In Proc. of the 15th ACM Int. Conf. on Multimedia, MM '07, pages 553–556, New York, NY, USA, 2007. ACM.
[106] Linas Baltrunas, Bernd Ludwig, Stefan Peer, and Francesco Ricci. Context relevance assessment and exploitation in mobile recommender systems. Personal Ubiquitous Comput., 16(5):507–526, June 2012.
[107] Zhenglian Su, Jun Yan, Haifeng Ling, and Haisong Chen. Research on personalized recommendation algorithm based on ontological user interest model. J. of Computational Information Systems, 8(1):169–181, January 2012.

[108] Christian Rack, Stefan Arbanowski, and Stephan Steglich. Context-aware, Ontology-based Recommendations. In SAINT-W '06: Proc. of the Int. Symposium on Applications on Internet Workshops, pages 98–104, Washington, DC, USA, 2006. IEEE Computer Society.
[109] Iván Cantador, Alejandro Bellogín, and Pablo Castells. Ontology-based personalised and context-aware recommendations of news items. In Proc. of the 2008 IEEE/WIC/ACM Int. Conf. on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT '08, pages 562–565, Washington, DC, USA, 2008. IEEE Computer Society.
[110] José Rodríguez, Maricela Bravo, and Rafael Guzmán. Multidimensional ontology model to support context-aware systems, 2013.
[111] Ahmad Hawalah and Maria Fasli. Utilizing contextual ontological user profiles for personalized recommendations. Expert Systems with Applications, 41(10):4777–4797, 2014.
[112] Anind K. Dey. Understanding and using context. Personal Ubiquitous Comput., 5(1):4–7, January 2001.
[113] M. Fernández-López, A. Gómez-Pérez, and N. Juristo. Methontology: from ontological art towards ontological engineering. In Proc. Symposium on Ontological Engineering of AAAI, 1997.
[114] Dominik Heckmann, Tim Schwartz, Boris Brandherm, Michael Schmitz, and Margeritta Wilamowitz-Moellendorff. User Modeling 2005: 10th Int. Conf., UM 2005, Edinburgh, Scotland, UK, July 24-29, 2005. Proc., chapter Gumo – The General User Model Ontology, pages 428–432. Springer Berlin Heidelberg, Berlin, Heidelberg, 2005.
[115] James A. Russell. A circumplex model of affect. J. of Personality and Social Psychology, 39(6):1161–1178, December 1980.
[116] Albert Mehrabian. Pleasure-arousal-dominance: A general framework for describing and measuring individual differences in temperament. Current Psychology, 14(4):261–292.
[117] Krzysztof Goczyla, Aleksander Waloszek, and Wojciech Waloszek. Towards context-semantic knowledge bases. In Maria Ganzha, Leszek A. Maciaszek, and Marcin Paprzycki, editors, Federated Conf. on Computer Science and Information Systems - FedCSIS 2012, Wroclaw, Poland, 9-12 September 2012, Proc., pages 475–482, 2012.
[118] Krzysztof Goczyła, Aleksander Waloszek, Wojciech Waloszek, and Teresa Zawadzka. Intelligent Tools for Building a Scientific Information Platform, chapter Modularized Knowledge Bases Using Contexts, Conglomerates and a Query Language, pages 179–201. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012.

[119] Aleksandra Karpus and Krzysztof Goczyla. A multi-domain hybrid recommender systems based on a dynamic contextual ontological user profile. In Doctoral Consortium (IC3K 2014), pages 83–87, 2014.
[120] Panagiotis Adamopoulos and Alexander Tuzhilin. Estimating the Value of Multi-Dimensional Data Sets in Context-based Recommender Systems. In 8th ACM Conf. on Recommender Systems (RecSys 2014), 2014.
[121] Yehuda Koren. Collaborative filtering with temporal dynamics. In Proc. of the 15th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, KDD '09, pages 447–456, New York, NY, USA, 2009. ACM.
[122] Aba-Sah Dadzie and Matthew Rowe. Approaches to visualising linked data: A survey. Semantic Web, 2(2):89–124, April 2011.
[123] Christian Becker and Christian Bizer. DBpedia Mobile: A location-enabled linked data browser. In Christian Bizer, Tom Heath, Kingsley Idehen, and Tim Berners-Lee, editors, LDOW, volume 369 of CEUR Workshop Proceedings. CEUR-WS.org, 2008.
[124] Emmanuel Pietriga, Christian Bizer, David Karger, and Ryan Lee. Fresnel: A browser-independent presentation vocabulary for RDF. In Isabel Cruz, Stefan Decker, Dean Allemang, Chris Preist, Daniel Schwabe, Peter Mika, Mike Uschold, and Lora M. Aroyo, editors, The Semantic Web - ISWC 2006, volume 4273 of Lecture Notes in Computer Science, pages 158–171. Springer Berlin Heidelberg, 2006.
[125] Denis Lukovnikov, Dimitris Kontokostas, Claus Stadler, Sebastian Hellmann, and Jens Lehmann. DBpedia Viewer - an integrative interface for DBpedia leveraging the DBpedia service eco system. In Proceedings of the Workshop on Linked Data on the Web co-located with the 23rd International World Wide Web Conference (WWW 2014), Seoul, Korea, April 8, 2014, 2014.
[126] Diego Valerio Camarda, Silvia Mazzini, and Alessandro Antonuccio. LodLive, exploring the web of data. In Proceedings of the 8th International Conference on Semantic Systems, I-SEMANTICS '12, pages 197–200, New York, NY, USA, 2012. ACM.
[127] Philipp Heim, Steffen Lohmann, and Timo Stegemann. Interactive relationship discovery via the semantic web. In Lora Aroyo, Grigoris Antoniou, Eero Hyvönen, Annette ten Teije, Heiner Stuckenschmidt, Liliana Cabral, and Tania Tudorache, editors, The Semantic Web: Research and Applications, volume 6088 of Lecture Notes in Computer Science, pages 303–317. Springer Berlin Heidelberg, 2010.
[128] Jakub Klímek, Jiří Helmich, and Martin Nečaský. Payola: Collaborative linked data analysis and visualization framework. In Philipp Cimiano, Miriam Fernández, Vanessa Lopez, Stefan Schlobach, and Johanna Völker, editors, The Semantic Web: ESWC 2013 Satellite Events, volume 7955 of Lecture Notes in Computer Science, pages 147–151. Springer Berlin Heidelberg, 2013.

[129] Luca Costabello, Serena Villata, Iacopo Vagliano, and Fabien Gandon. Assisted policy management for SPARQL endpoints access control. In Eva Blomqvist and Tudor Groza, editors, International Semantic Web Conference (Posters & Demos), volume 1035 of CEUR Workshop Proceedings, pages 33–36. CEUR-WS.org, 2013.
[130] Patrick Höfler, Michael Granitzer, Eduardo E. Veas, and Christin Seifert. Linked data query wizard: A novel interface for accessing SPARQL endpoints. In Proceedings of the Workshop on Linked Data on the Web co-located with the 23rd International World Wide Web Conference (WWW 2014), Seoul, Korea, April 8, 2014, 2014.
[131] Axel-Cyrille Ngonga Ngomo, Lorenz Bühmann, Christina Unger, Jens Lehmann, and Daniel Gerber. Sorry, I don't speak SPARQL: Translating SPARQL queries into natural language. In Proceedings of the 22nd International Conference on World Wide Web, WWW '13, pages 977–988, Republic and Canton of Geneva, Switzerland, 2013. International World Wide Web Conferences Steering Committee.
[132] Daniel Sonntag and Philipp Heim. A constraint-based graph visualisation architecture for mobile semantic web interfaces. In Bianca Falcidieno, Michela Spagnuolo, Yannis Avrithis, Ioannis Kompatsiaris, and Paul Buitelaar, editors, Semantic Multimedia, volume 4816 of Lecture Notes in Computer Science, pages 158–171. Springer Berlin Heidelberg, 2007.
[133] V.K. Singh, R. Piryani, A. Uddin, P. Waila, and Marisha. Sentiment analysis of textual reviews; evaluating machine learning, unsupervised and SentiWordNet approaches. In Knowledge and Smart Technology (KST), 2013 5th International Conference on, pages 122–127, Jan 2013.
[134] David Sánchez, David Isern, and Miquel Millan. Content annotation for the semantic web: an automatic web-based approach. Knowledge and Information Systems, 27(3):393–418, 2011.
[135] Alexander Hogenboom, Frederik Hogenboom, Flavius Frasincar, Kim Schouten, and Otto van der Meer. Semantics-based information extraction for detecting economic events. Multimedia Tools and Applications, 64(1):27–52, 2013.
[136] Ron Kohavi and Foster Provost. Glossary of terms. Machine Learning, 30:271–274, 1998.
[137] Diana McCarthy. Word sense disambiguation: An overview. Language and Linguistics Compass, 3(2):537–558, 2009.
[138] Aastha Gupta, Rachna Rajput, Richa Gupta, and Monika Arora. Hybrid model to improve time complexity of words search in POS tagging, Sept 2014.

[139] Weiwei Cheng and Eyke Hüllermeier. Combining instance-based learning and logistic regression for multilabel classification. Machine Learning, 76(2-3):211–225, 2009.
[140] Bin Fu, Zhihai Wang, Guandong Xu, and Longbing Cao. Multi-label learning based on iterative label propagation over graph. Pattern Recognition Letters, 42(0):85–90, 2014.
[141] Claude Sammut and Geoffrey I. Webb, editors. Encyclopedia of Machine Learning. Springer, 2010.
[142] David C. Anastasiu and George Karypis. L2AP: Fast cosine similarity search with prefix L-2 norm bounds. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 784–795, March 2014.
[143] Yiming Yang. A study of thresholding strategies for text categorization. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '01, pages 137–145, New York, NY, USA, 2001. ACM.
[144] Xiaofeng He, Rong Zhang, and Aoying Zhou. Threshold selection for classification with skewed class distribution. In Yunjun Gao, Kyuseok Shim, Zhiming Ding, Peiquan Jin, Zujie Ren, Yingyuan Xiao, An Liu, and Shaojie Qiao, editors, Web-Age Information Management, volume 7901 of Lecture Notes in Computer Science, pages 383–393. Springer Berlin Heidelberg, 2013.
[145] Cristhian Figueroa, Iacopo Vagliano, Oscar Rodríguez Rocha, and Maurizio Morisio. A systematic literature review of linked data-based recommender systems. Concurrency and Computation: Practice and Experience, 2015.
[146] Cristhian Figueroa, Iacopo Vagliano, Oscar Rodriguez Rocha, Marco Torchiano, Catherine Faron-Zucker, Juan Carlos Corrales, and Maurizio Morisio. Allied: A framework for executing linked data-based recommendation algorithms. International Journal on Semantic Web and Information Systems, 13(3), In press.
[147] Oscar Rodriguez Rocha, Cristhian Figueroa, Iacopo Vagliano, and Boris Moltchanov. Linked data-driven smart spaces. In Internet of Things, Smart Spaces, and Next Generation Networks and Systems - 14th International Conference, NEW2AN 2014 and 7th Conference, ruSMART 2014, St. Petersburg, Russia, August 27-29, 2014. Proceedings, pages 3–15, 2014.
[148] Iacopo Vagliano, Cristhian Figueroa, Oscar Rodriguez Rocha, Marco Torchiano, Catherine Faron-Zucker, and Maurizio Morisio. ReDyAl: A dynamic recommendation algorithm based on linked data. In Proceedings of the 3rd Workshop on New Trends in Content-Based Recommender Systems co-located with ACM Conference on Recommender Systems (RecSys 2016), Boston, MA, USA, September 16, 2016, pages 31–38, 2016.

[149] Aleksandra Karpus, Iacopo Vagliano, Krzysztof Goczyla, and Maurizio Morisio. An ontology-based contextual pre-filtering technique for recommender systems. In Proceedings of the 2016 Federated Conference on Computer Science and Information Systems, FedCSIS 2016, Gdańsk, Poland, September 11-14, 2016, pages 411–420, 2016.
[150] Iacopo Vagliano, Marco Marengo, and Maurizio Morisio. DBpedia Mobile Explorer. In 1st International Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), pages 181–185, Sept 2015.
[151] Oscar Rodriguez Rocha, Iacopo Vagliano, Cristhian Figueroa, Federico Cairo, Giuseppe Futia, Carlo A. Licciardi, Marco Marengo, and Federico Morando. Semantic annotation and classification in practice. IT Professional, 17(2):33–39, Mar 2015.
[152] Ignacio Fernández-Tobías, Iván Cantador, Marius Kaminskas, and Francesco Ricci. A generic semantic-based framework for cross-domain recommendation. In Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems, HetRec '11, pages 25–32, New York, NY, USA, 2011. ACM.
[153] Jörg Waitelonis and Harald Sack. Towards exploratory video search using linked data. Multimedia Tools and Applications, 59(2):645–672, July 2012.
[154] Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, Davide Romito, and Markus Zanker. Linked open data to support content-based recommender systems. In Proceedings of the 8th International Conference on Semantic Systems, I-SEMANTICS '12, pages 1–8, New York, NY, USA, 2012. ACM.
[155] Rouzbeh Meymandpour and Joseph G. Davis. Recommendations using linked data. In Proceedings of the 5th Ph.D. Workshop on Information and Knowledge, PIKM '12, pages 75–82, New York, NY, USA, 2012. ACM.
[156] Tommaso Di Noia, Roberto Mirizzi, Vito Claudio Ostuni, and Davide Romito. Exploiting the web of data in model-based recommender systems. In Proceedings of the Sixth ACM Conference on Recommender Systems, RecSys '12, pages 253–256, New York, NY, USA, 2012. ACM.
Appendix A

Selection and Synthesis in the Systematic Literature Review

A.1 Initial Set of Papers

Table A.1 Initial set of papers and the keywords listed in each of them. The search string used in the systematic review was built from these keywords.

Paper | Keywords
Passant [66] | Semantic Web Applications, Linked Data, Recommendation Systems, Semantic Distance, DBpedia
Fernandez-Tobias et al. [152] | Recommender Systems, Cross-Domain Recommendation, , Semantic Networks, Linked Data, DBpedia
Waitelonis and Sack [153] | Linked Open Data, Video search, Exploratory Search
Damljanovic et al. [71] | Concept Recommendation, Structure-Based Similarity, Semantic Similarity, Information Retrieval, Statistical Semantics, Linked Data, Ontologies, Recommender Systems, Concept Discovery, Open Innovation
Di Noia et al. [154] | Content-Based Recommender Systems, Vector Space Model, Linked Data, DBpedia, LinkedMDB, Freebase, Semantic Web, MovieLens, Precision, Recall
Meymandpour and Davis [155] | Linked Data, Semantic Web, Similarity Metrics, Web of Data, Recommendation, Partitioned Information Content
Di Noia et al. [156] | Model-based Recommender Systems, SVM, Linked Data, DBpedia, LinkedMDB, Semantic Web, MovieLens, Precision, Recall

A.2 Selected Papers

Table A.2 Selected papers (P) and corresponding studies (S). Rows in italics identify papers belonging to a study already reported by another paper (e.g. papers 10, 19, 54 belong to the same study S10).

P | S | Authors | Year | Title | Publication details
1 | S1 | Fernández-Tobías, I., Cantador, I., Kaminskas, M., Ricci, F. | 2011 | A generic semantic-based framework for cross-domain recommendation | 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems - HetRec '11, pp 25–32
2 | S2 | Kabutoya, Y., Sumi, R., Iwata, T., Uchiyama, T., Uchiyama, T. | 2012 | A Topic Model for Recommending Movies via Linked Open Data | International Conferences on Web Intelligence and Intelligent Agent Technology, pp 625–630
3 | S3 | Dell'Aglio, D., Celino, I., Cerizza, D. | 2010 | Anatomy of a Semantic Web-enabled Knowledge-based Recommender System | 4th international workshop Semantic Matchmaking and Resource Retrieval in the Semantic Web, at the 9th International Semantic Web Conference, pp 115–130
4 | S4 | Mannens, E., Coppens, S., Wica, I., Dacquin, H., Van De Walle, R. | 2013 | Automatic News Recommendations via aggregated Profiling | Journal Multimedia Tools and Applications, 63 (2), pp 407–425
5 | S5 | Dzikowski, J., Kaczmarek, M. | 2012 | Challenges in Using Linked Data within a Social Web Recommendation Application to Semantically Annotate and Discover Venues | International Cross Domain Conference and Workshop, pp 360–374
6 | S6 | Wardhana, A.T.A., Nugroho, H.T. | 2013 | Combining FOAF and Music Ontology for Music Concerts Recommendation on Facebook Application | Conference on New Media Studies, pp 1–5
7 | S7 | Passant, A., Raimond, Y. | 2008 | Combining Social Music and Semantic Web for music-related recommender systems | First Workshop on Social Data on the Web, pp 19–30
8 | S8 | Lindley, A., Graf, R. | 2011 | Computing Recommendations for Long Term Data Accessibility basing on Open Knowledge and Linked Data | 5th ACM Conference on Recommender Systems, pp 51–58

9 | S9 | Passant, A. | 2010 | dbrec - Music Recommendations Using DBpedia | The Semantic Web - ISWC 2010, pp 209–224
10 | S10 | Stankovic, M., Breitfuss, W., Laublet, P. | 2011 | Discovering Relevant Topics Using DBPedia: Providing Non-obvious Recommendations | 2011 International Conferences on Web Intelligence and Intelligent Agent Technology, 1, pp 219–222
11 | S11 | Marie, N., Gandon, F., Ribière, M., Rodio, F. | 2013 | Discovery Hub: on-the-fly linked data exploratory search | 9th International Conference on Semantic Systems, pp 17–24
12 | S12 | Peska, L., Vojtas, P. | 2013 | Enhancing Recommender System with Linked Open Data | 10th International Conference on Flexible Query Answering Systems, pp 483–494
13 | S13 | Di Noia, T., Mirizzi, R., Ostuni, V. C., Romito, D. | 2012 | Exploiting the web of data in model-based recommender systems | 6th ACM conference on Recommender systems
14 | S14 | Golbeck, J. | 2006 | Filmtrust: movie recommendations from semantic web-based social networks | 3rd IEEE Consumer Communications and Networking Conference, pp 1314–1315
15 | S15 | Celma, Ò., Serra, X. | 2008 | FOAFing the music: Bridging the semantic gap in music recommendation | Web Semantics: Science, Services and Agents on the World Wide Web, 6 (4), pp 250–256
16 | S16 | Varga, B., Groza, A. | 2011 | Integrating DBpedia and SentiWordNet for a tourism recommender system | 7th International Conference on Intelligent Computer Communication and Processing, pp 133–136
17 | S17 | Kaminskas, M., Fernández-Tobías, I., Ricci, F., Cantador, I. | 2012 | Knowledge-based music retrieval for places of interest | Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies - MIRUM '12, pp 19–24
18 | S18 | Dietze, S. | 2012 | Linked Data as facilitator for TEL recommender systems in research & practice | 2nd Workshop on Recommender Systems for Technology Enhanced Learning, pp 7–10
19 | S10 | Damljanovic, D., Stankovic, M., Laublet, P. | 2012 | Linked Data-Based Concept Recommendation: Comparison of Different Methods | 9th Extended Semantic Web Conference, pp 24–38

20 | S19 | Kitaya, K., Huang, H. H., Kawagoe, K. | 2012 | Music curator recommendations using linked data | Second International Conference on the Innovative Computing Technology, pp 337–339
21 | S20 | Jung, K., Hwang, M., Kong, H., Kim, P. | 2005 | RDF Triple Processing Methodology for the Recommendation System Using Personal Information | International Conference on Next Generation Web Services Practices, pp 241–246
22 | S21 | Calì, A., Capuzzi, S., Dimartino, M. M., Frosini, R. | 2013 | Recommendation of Text Tags in Social Applications Using Linked Data | ICWE 2013 Workshops
23 | S21 | Calì, A., Capuzzi, S., Dimartino, M. M., Frosini, R. | 2013 | Recommendation of Text Tags Using Linked Data | 3rd International Workshop on Over the Web, pp 1–3
24 | S22 | Meymandpour, R., Davis, J. G. | 2012 | Recommendations using linked data | 5th Ph.D. workshop on Information and knowledge - PIKM '12, pp 75–82
25 | S23 | Harispe, S., Ranwez, S., Janaqi, S., Montmain, J. | 2013 | Semantic Measures Based on RDF Projections: Application to Content-Based Recommendation Systems | On the Move to Meaningful Internet Systems: OTM 2013 Conferences SE - 44, pp 606–615
26 | S24 | Hopfgartner, F., Jose, J. M. | 2010 | Semantic user profiling techniques for personalised multimedia recommendation | Multimedia Systems, 16 (4-5), pp 255–274
27 | S5 | Łazaruk, S., Dzikowski, J., Kaczmarek, M., Abramowicz, W. | 2012 | Semantic Web Recommendation Application | Federated Conference on Computer Science and Information Systems (FedCSIS), pp 1055–1062
28 | S25 | Ostuni, V. C., Di Noia, T., Di Sciascio, E., Mirizzi, R. | 2013 | Top-N recommendations from implicit feedback leveraging linked open data | Proceedings of the 7th ACM conference on Recommender systems, pp 85–92
29 | S26 | Ahn, J., Amatriain, X. | 2010 | Towards Fully Distributed and Privacy-Preserving Recommendations via Expert Collaborative Filtering and RESTful Linked Data | International Conference on Web Intelligence and Intelligent Agent Technology, pp 66–73
30 | S27 | Heitmann, B., Hayes, C. | 2010 | Using Linked Data to Build Open, Collaborative Recommender Systems | AAAI Spring Symposium: Linked Data Meets Artificial Intelligence, pp 76–81

31 | S28 | Zarrinkalam, F., Kahani, M. | 2012 | A multi-criteria hybrid citation recommendation system based on linked data | 2nd International eConference on Computer and Knowledge Engineering (ICCKE), 2012, pp 283–288
32 | S29 | Lommatzsch, A., Kille, B., Kim, J. W., Albayrak, S. | 2013 | An Adaptive Hybrid Movie Recommender based on Semantic Data | 10th Conference on Open Research Areas in Information Retrieval, pp 217–218
33 | S30 | Torres, D., Skaf-Molli, H., Molli, P., Díaz, A. | 2013 | BlueFinder: Recommending Wikipedia Links Using DBpedia Properties | 5th Annual ACM Web Science Conference, pp 413–422
34 | S31 | Ostuni, V. C., Di Noia, T., Mirizzi, R., Romito, D., Di Sciascio, E. | 2012 | Cinemappy: a Context-aware Mobile App for Movie Recommendations boosted by DBpedia | International Workshop on Semantic Technologies meet Recommender Systems & Big Data SeRSy 2012, pp 37–48
35 | S33 | Zhang, Y., Wu, H., Sorathia, V., Prasanna, V. K. | 2013 | Event recommendation in social networks with linked data enablement | 15th International Conference on Enterprise Information Systems, pp 371–379
36 | S34 | Mirizzi, R., Di Noia, T. | 2010 | From exploratory search to web search and back | 3rd workshop on Ph.D. students in information and knowledge management - PIKM '10, pp 39–46
37 | S35 | Khrouf, H., Troncy, R. | 2013 | Hybrid event recommendation using linked data and user diversity | Proceedings of the 7th ACM conference on Recommender systems, pp 185–192
38 | S36 | Bahls, D., Scherp, G., Tochtermann, K., Hasselbring, W. | 2012 | Towards a Recommender System for Statistical Research Data | 2nd International Workshop on Semantic Digital Archives
39 | S37 | Cheng, G., Gong, S., Qu, Y. | 2011 | An Empirical Study of Vocabulary Relatedness and Its Application to Recommender Systems | 10th International Conference on The Semantic Web - Volume Part I, pp 98–113
40 | S38 | Wang, Y., Stash, N., Aroyo, L., Gorgels, P., Rutledge, L., Schreiber, G. | 2008 | Recommendations based on semantically enriched museum collections | Web Semantics: Science, Services and Agents on the World Wide Web, 6 (4), pp 283–290
41 | S11 | Marie, N., Gandon, F., Legrand, D., Ribière, M. | 2013 | Discovery Hub: a discovery engine on the top of DBpedia | 3rd International Conference on Web Intelligence, Mining and Semantics

42 | S31 | Di Noia, T., Mirizzi, R., Ostuni, V. C., Romito, D., Zanker, M. | 2012 | Linked open data to support content-based recommender systems | 8th International Conference on Semantic Systems
43 | S31 | Ostuni, V. C., Gentile, G., Di Noia, T., Mirizzi, R., Romito, D., Di Sciascio, E. | 2013 | Mobile Movie Recommendations with Linked Data | International Cross-Domain Conference, pp 400–415
44 | S31 | Mirizzi, R., Di Noia, T., Ragone, A., Ostuni, V. C., Di Sciascio, E. | 2012 | Movie recommendation with DBpedia | 3rd Italian Information Retrieval Workshop, pp 101–112
45 | S39 | Waitelonis, J., Sack, H. | 2011 | Towards exploratory video search using linked data | Multimedia Tools and Applications, 59 (2), pp 645–672
46 | S40 | Li, S., Zhang, Y., Sun, H. | 2010 | Mashup FOAF for Video Recommendation LightWeight Prototype | 7th Web Information Systems and Applications Conference, pp 190–193
47 | S41 | Hu, Y., Wang, Z., Wu, W., Guo, J., Zhang, M. | 2010 | Recommendation for Movies and Stars Using YAGO and IMDB | 12th International Asia-Pacific Web Conference, pp 123–129
48 | S42 | Ruotsalo, T., Haav, K., Stoyanov, A., Roche, S., Fani, E., Deliai, R., Mäkelä, E., Kauppinen, T., Hyvönen, E. | 2013 | SMARTMUSEUM: A mobile recommender system for the Web of Data | Web Semantics: Science, Services and Agents on the World Wide Web, 20, pp 50–67
49 | S43 | Stankovic, M., Jovanovic, J., Laublet, P. | 2011 | Linked Data Metrics for Flexible Expert Search on the Open Web | 8th Extended Semantic Web Conference, pp 108–123
50 | S44 | Ozdikis, O., Orhan, F., Danismaz, F. | 2011 | Ontology-based recommendation for points of interest retrieved from multiple data sources | International Workshop on Semantic Web Information Management, pp 1–6
51 | S45 | Debattista, J., Scerri, S., Rivera, I., Handschuh, S. | 2012 | Ontology-based rules for recommender systems | International Workshop on Semantic Technologies meet Recommender Systems & Big Data, pp 49–60
52 | S46 | Codina, V., Ceccaroni, L. | 2010 | Taking Advantage of Semantics in Recommendation Systems | 2010 Conference on Artificial Intelligence Research and Development, pp 163–172

53 | S9 | Passant, A., Decker, S. | 2010 | Hey! Ho! Let's Go! Explanatory Music Recommendations with dbrec | 7th Extended Semantic Web Conference, pp 411–415
54 | S10 | Stankovic, M., Breitfuss, W., Laublet, P. | 2011 | Linked-data based suggestion of relevant topics | 7th International Conference on Semantic Systems, pp 49–55
55 | S9 | Passant, A. | 2010 | Measuring semantic distance on linking data and using it for resources recommendations | AAAI Spring Symposium: Linked Data Meets Artificial Intelligence, pp 93–98
56 | S14 | Golbeck, J. | 2006 | Generating Predictive Movie Recommendations from Trust in Social Network | 4th International Conference, iTrust 2006, pp 93–104
57 | S39 | Sack, H. | 2009 | Augmenting Video Search with Linked Open Data | International Conference on Semantic Systems, pp 550–558
58 | S47 | Baumann, S., Schirru, R., Streit, B. | 2011 | Towards a Storytelling Approach for Novel Artist Recommendations | 8th International Workshop, AMR 2010, Linz, Austria, August 17-18, 2010, Revised Selected Papers, pp 1–15
59 | S48 | Corallo, A., Lorenzo, G., Solazzo, G. | 2006 | A Semantic Recommender Engine Enabling an eTourism Scenario | 10th International Conference, pp 1092–1101
60 | S49 | Nuzzolese, A. G., Presutti, V., Gangemi, A., Musetti, A., Ciancarini, P. | 2013 | Aemoo: Exploring Knowledge on the Web | Proceedings of the 5th Annual ACM Conference, pp 272–275
61 | S49 | Musetti, A., Nuzzolese, A., Draicchio, F., Presutti, V., Blomqvist, E., Gangemi, A., Ciancarini, P. | 2012 | Aemoo: Exploratory Search based on Knowledge Patterns over the Semantic Web | Semantic Web Challenge
62 | S47 | Baumann, S., Schirru, R. | 2012 | Using Linked Open Data for Novel Artist Recommendations | 13th International Society for Music Information Retrieval Conference
63 | S50 | Cantador, I., Castells, P. | 2006 | Multilayered Semantic Social Network Modeling by Ontology-Based User Profiles Clustering: Application to Collaborative Filtering | Proceedings of 15th International Conference, pp 334–349

64 | S34 | Mirizzi, R., Ragone, A., Di Noia, T., Di Sciascio, E. | 2010 | Ranking the Linked Data: The Case of DBpedia | 10th International Conference, pp 337–354
65 | S51 | Heitmann, B., Hayes, C. | 2010 | Enabling Case-Based Reasoning on the Web of Data | The WebCBR Workshop on Reasoning from Experiences on the Web at International Conference on Case-Based Reasoning
66 | S52 | Alvaro, G., Ruiz, C., Córdoba, C., Carbone, F., Castagnone, M., Gómez-Pérez, J. M., Contreras, J. | 2011 | miKrow: Semantic Intra-enterprise Micro-Knowledge Management System | 8th Extended Semantic Web Conference, pp 154–168
67 | S50 | Cantador, I., Castells, P., Bellogín, A. | 2011 | An Enhanced Semantic Layer for Hybrid Recommender Systems: Application to News Recommendation | Int. J. Semant. Web Inf. Syst., 7 (1), pp 44–78
68 | S32 | Cantador, I., Konstas, I., Jose, J. M. | 2011 | Categorising social tags to improve folksonomy-based recommendations | Web Semantics: Science, Services and Agents on the World Wide Web, 9 (1), pp 1–15
69 | S29 | Lommatzsch, A., Kille, B., Albayrak, S. | 2013 | A Framework for Learning and Analyzing Hybrid Recommenders based on Heterogeneous Semantic Data | 10th Conference on Open Research Areas in Information Retrieval, pp 137–140

A.3 Excluded Papers

Table A.3 Papers (P) excluded from the systematic literature review during the data extraction.

P | Authors | Year | Title | Publication details | Reason
18 | Wu, C., Wu, J., Ye, G., He, L., Huang, L., Xie, M. | 2013 | Linked Course Data-based User Personal Knowledge Recommendation Engine Architecture of Linked Course Data | Journal of Computational Information Systems, 9 (5), pp 1735–1742 | The study is not a RS.

36 | Pereira Nunes, B. P., Dietze, S., Casanova, M. A., Kawase, R., Fetahu, B., Nejdl, W. | 2013 | Combining a Co-occurrence-Based and a Semantic Measure for Entity Linking | 10th International Conference, pp 548–562 | The study is not a RS.
41 | Ruotsalo, T., Hyvönen, E. | 2007 | A Method for Determining Ontology-Based Semantic Relevance | 18th International Conference, pp 680–688 | The study is not a RS. It is a semantic similarity method.
43 | Parundekar, R., Oguchi, K. | 2012 | Driver recommendations of POIs using a semantic content-based approach | International Workshop on Semantic Technologies Meet Recommender Systems and Big Data | The study is a RS but uses a classical kNN algorithm for recommendation. RDF data is used only in an intermediate step to integrate data.
50 | Song, T., Zhang, D., Shi, X., He, J., Kang, Q. | 2014 | Combining Fusion and Ranking on Linked Data for Multimedia Recommendation | Proceedings of the 9th International Symposium on Linear Drives for Industry Applications, Volume 3, pp 531–538 | The full paper of the study is not available.
51 | Pham, X.H., Jung, J.J., Takeda, H. | 2013 | Exploiting linked open data for attribute selection on recommendation systems | Find out how to access preview-only content | The full paper of the study is not available.
53 | Kim, T., Kim, P., Lee, S., Jung, H., Sung, W. K. | 2013 | OntoURIResolver: Resolution and recommendation of URIs Published in LOD | Proceedings of the 9th International Symposium on Linear Drives for Industry Applications, Volume 3 | The study is not a RS.
54 | Kurz, T., Bürger, T., Sint, R., Mika, P., Vallet, D., Carrero, F. M. | 2010 | R3 - A related resource recommender | Lecture Notes in Electrical Engineering Volume 272, 2014, pp 531–538 | The study is not a RS. It is more an interlinking framework.

56 | Morshed, A., Dutta, R., Aryal, J. | 2013 | Recommending environmental knowledge as linked open data cloud using semantic machine learning | 29th International Conference on Data Engineering Workshops | The study is a knowledge base based on data integration and knowledge recommendation. Linked Data is not used to provide recommendations, but only for data integration.
58 | Wu, J.Y., Wu, C.L. | 2014 | The Study of User Model of Personalized Recommendation System Based on Linked Course Data | Applied Mechanics and Materials, 519-520, pp 1609–1612 | The full paper of the study is not available (excluded before data extraction).
60 | Kim, T., Kim, P., Lee, S., Jung, H., Sung, W. K. | 2011 | OntoURIResolver: URI Resolution and Recommendation Service Using LOD | Future Generation Information Technology Conference, pp 245–250 | The study is not a RS.
61 | Bianchini, D. | 2012 | A Classification of Web API Selection Solutions over the Linked Web | 2nd International Workshop on Semantic Search over the Web | The study is not a RS.
69 | Bellekens, P., Houben, G.-J., Aroyo, L., Schaap, K., Kaptein, A. | 2009 | User Model Elicitation and Enrichment for Context-sensitive Personalization in a Multiplatform Tv Environment | Proceedings of the seventh european conference on European interactive television conference, pp 119–128 | The study is not a RS.
71 | Zarrinkalam, F., Kahani, M. | 2011 | Improving bibliographic search through dataset enrichment using Linked Data | 1st International eConference on Computer and Knowledge Engineering, pp 254–259 | The study is not a RS. It uses Linked Data to enrich data.
72 | Sheng, H., Chen, H., Yu, T., Feng, Y. | 2010 | Linked data based semantic similarity and data mining | 2010 IEEE International Conference on Information Reuse & Integration, pp 104–108 | The study is not a RS. It is a semantic similarity method.
73 | Qing, H., Dietze, S., Giordano, D., Taibi, D., Kaldoudi, E., Dovrolis, N. | 2012 | Linked education: interlinking educational resources and the web of data | 27th Annual ACM Symposium on Applied Computing, pp 366–371 | The study is not a RS. It uses Linked Data to enrich data.

74 | Albertoni, R., De Martino, M. | 2006 | Semantic Similarity of Ontology Instances Tailored on the Application Context | OTM Confederated International Conferences, CoopIS, DOA, GADA, and ODBASE 2006, Proceedings, Part I, pp 1020–1038 | The study is not a RS. It is a semantic similarity method.
77 | Tous, R., Delgado, J. | 2006 | A Vector Space Model for Semantic Similarity Calculation and OWL Ontology Alignment | 17th International Conference, pp 307–316 | The study is not a RS. It is a semantic similarity method.
83 | Leal, J. P., Rodrigues, V., Queirós, R. | 2012 | Computing Semantic Relatedness using DBPedia | 1st Symposium on Languages, Applications and Technologies, pp 133–147 | The study is not a RS. It is a semantic similarity method.
85 | Golbeck, J., Hendler, J. | 2006 | FilmTrust: movie recommendations using trust in web-based social networks | Consumer Communications and Networking Conference, pp 282–286 | The study is a RS but does not use Linked Data to provide recommendations.
87 | Xie, M. | 2011 | Semantic-Based Linked Data Mining and Services | Journal of Information and Computational Science, 12 (December), pp 3981–3988 | The study is not a RS.
90 | Shabir, N., Clarke, C. | 2009 | Using Linked Data as a basis for a Learning Resource Recommendation System | 1st International Workshop on Semantic Web Applications for Learning and Teaching Support in Higher Education | The study is not a RS.

A.4 Thematic Synthesis

The main topics covered by our systematic review are presented in the model of higher-order themes in Figure A.1. It resembles the coding that we used in the synthesis phases described in Section 3.2.3. As can be seen, topics are grouped by each research question (RQ) proposed in Section 3.2.1.


Fig. A.1 The model of higher-order themes of our systematic review.