Adaptive Retrieval, Composition & Presentation of Closed-Corpus and Open-Corpus Information

A thesis submitted to the University of Dublin, Trinity College for the degree of Doctor of Philosophy

Ben Steichen
Knowledge and Data Engineering Group
Intelligent Systems
School of Computer Science and Statistics
Trinity College Dublin
Ireland
2012

Declaration

I declare that this thesis has not been submitted as an exercise for a degree at this or any other university and it is entirely my own work. I agree to deposit this thesis in the University’s open access institutional repository or allow the library to do so on my behalf, subject to Irish Copyright Legislation and Trinity College Library conditions of use and acknowledgement.

Ben Steichen
March 2012

Acknowledgements

I would like to deeply thank my supervisor Prof. Vincent Wade, whose expert knowledge, guidance and dedication have mentored me throughout this work. I would also like to express my gratitude to Prof. Helen Ashman for her insightful feedback.

A huge thank you to all my colleagues and friends in the Knowledge and Data Engineering Group (KDEG), many of whom I had the chance to collaborate with over the course of this research. I would also like to extend my appreciation to all members of the Centre for Next Generation Localisation (CNGL). In particular, I would like to express my gratitude to Fred Hollowood, Johann Roturier and Jason Rickard from Symantec for all their contributions and help during this research.

Most importantly, I would like to thank Bo and my parents for their unconditional belief, support and encouragement throughout the years.

Abstract

A key challenge for information access systems lies in their ability to deliver information that is most suited to a user’s needs, preferences and context. Personalised Information Retrieval (PIR) seeks to address this challenge by tailoring the selection of results to each individual user.
Such PIR systems typically generate adaptive result rankings based on historic user interests or location properties. However, other considerations such as user needs, preferences or context are often neglected. Moreover, users are typically only presented with linear (monolingual) result rankings that do not provide any adaptive navigation support across different information sources.

On the other hand, the field of Adaptive Hypermedia (AH) has inherently focused on generating non-linear, hyperlinked result compositions. This enables adaptive navigation and presentation support, allowing users a guided experience through an information space. Moreover, AH systems typically generate adaptive responses according to multiple considerations (also called personalisation “dimensions”), such as user needs, knowledge and context. However, AH techniques have typically only been applied across closed-corpus content bases, requiring substantial amounts of metadata. The key problem remains in providing such adaptive compositions across open-corpus information sources (in addition to closed corpora).

In order to address this problem, the thesis presents a novel compositional approach to open- and closed-corpus information retrieval and delivery through an innovative combination of Adaptive Hypermedia and Personalised Information Retrieval techniques. This technology enables the first dynamic integration and multidimensional adaptation of multilingual open and closed corpora. In particular, the contribution of the thesis is an extension of PIR and AH techniques to enable informed multiple adaptive query generation and adaptive result recomposition and presentation. This innovation is evaluated and validated through a series of case study implementations and evaluations, which show that the compositional approach successfully supports authentic user information needs in a personalised manner.
In particular, it is shown that users are more efficient, effective and satisfied with the compositional approach compared to conventional information retrieval systems. Moreover, the approach is shown to be able to support multiple dimensions of adaptation, including user intent, language, knowledge, interface preferences and device capabilities.

Table of Contents

Declaration
Acknowledgements
Abstract
Table of Contents
Table of Figures
Table of Tables
1 Introduction
  1.1. Motivation
  1.2. Research Question
  1.3. Objectives
  1.4. Methodology
  1.5. Contribution
  1.6. Thesis Overview
2 Adaptive Hypermedia & Personalised Information Retrieval
  2.1. Introduction
  2.2. Query adaptation
    2.2.1. Summary and Critique
    2.2.2. Comparison across Query Adaptation techniques
  2.3. Retrieval adaptation
    2.3.1. Statistical methods
    2.3.2. Metadata-based approaches
    2.3.3. Summary and Critique
    2.3.4. Comparison across Retrieval Adaptation techniques
  2.4. Adaptive Composition & Presentation
    2.4.1. Statistical techniques
    2.4.2. Metadata-based techniques
    2.4.3. Summary and Critique
    2.4.4. Comparison across Adaptive Composition & Presentation techniques
  2.5. Conclusions
    2.5.1. User dimensions
    2.5.2. Adaptation techniques
    2.5.3. Overall Findings and Complementary Affordances
3 Initial Adaptive Open-Corpus Composition System
  3.1. Introduction
  3.2. Contribution of the author
  3.3. Architecture
    3.3.1. Models
    3.3.2. Architecture Components & Capabilities
    3.3.3. Technological Architecture
  3.4. Prototype Implementation
    3.4.1. Prototype Prerequisites
    3.4.2. Adaptation Process
  3.5. Evaluation
    3.5.1. Educational Benefit and User Satisfaction Hypotheses
    3.5.2. Comparison to baselines
    3.5.3. Experimental Setup
    3.5.4. Results