A Dialectical Model of Information Retrieval: Exploring a Contradiction in Terms

A dialectical model of information retrieval: exploring a contradiction in terms Clare Victoria Thornley Department of Computer and Information Sciences University of Strathclyde 2005 Thesis submitted for degree of Doctor of Philosophy Declaration of authorship The copyright of this thesis belongs to the author under the terms of the United Kingdom Copyright Acts as qualified by the University of Strathclyde Regulation 3.51. Due acknowledgement must always be made of the use of material contained in, or derived from, this thesis. ii Acknowledgements This thesis was funded by the Humanities Research Board at the British Academy. I would like to thank Professor Forbes Gibb and Dr. Ruben Leon for their supervision. Linda Hunter, Zakia Majid and Duncan Sinclair provided much appreciated practical and technical support. The staff at Celbridge local library were also invaluable in their work obtaining essential texts from the British Library. I could not have completed this thesis without the support and encouragement of my husband, Sean Moloney. Many people have helped me with my children. Claire Ryan, Paula Carroll, Michele Whelan and my mother provided wonderful care for Lorna and Hannah. Joanna Swanson provided both great hospitality and childcare in Edinburgh, which made it a pleasure to travel for supervision. My friend Julia Yelloly was also a great help with the final checking of the text. A PhD is convincing evidence that Wittgenstein was right about the collective nature of our intellectual life and that Descarte’s isolated cogito is not actually how thinking happens. Thus, although the ideas in this thesis are mine, many other people created the context that made them possible. iii “The spirit of reconciliation, the Quaker librarian breathed. There can be no reconciliation, Stephen said, if there has not been a sundering.” (Ulysses, James Joyce, chapter 9: Scylla and Charybdis.) iv Preface Information Retrieval (IR) has two main research traditions. These are the empirical or quantitative tradition, which deals primarily with symbol manipulation and matching and does not explicitly concern itself with the problem of meaning, and the cognitive or qualitative tradition, which proposes that the problem of meaning is central to IR. This thesis discusses the basic principles of both these research traditions and concludes that meaning is central to IR and plays an important part in IR research. The empirical tradition often uses mathematical or logical models as a theoretical basis to its work whilst the cognitive tradition tends to use theories from philosophy, psychology or sociology. The theoretical approach taken in this thesis is philosophical rather than mathematical. The nature of the subjective/objective relationship in meaning is discussed in relation to work in the philosophy of meaning and the issues raised in this analysis are then used to investigate the role of meaning in IR. On the basis of this discussion a dialectical model is developed which provides a new analytical tool for investigating IR problems in terms of their relationship to the dialectical nature of meaning. v Abstract Information retrieval (IR) is the process of representing the meaning of documents so that people who want the information they contain can retrieve them. It is, therefore, centrally concerned with information and meaning. It is concerned with them both on a pragmatic level in terms of designing and making IR systems, and on a theoretical level in terms of why and how these systems work and what this could have to do with the nature of meaning and information. This thesis is primarily about the theoretical and philosophical issues in IR. The main question discussed is the extent to which an investigation into the relationship between the subjective and the objective can improve our understanding of how meaning and information operate in IR. My thesis is that this relationship is a dialectical one, the subjective and the objective exist in a mutually antagonistic and dependent relationship, and that this new perspective on its nature can be theoretically useful for IR. Thus I develop a new theoretical perspective, the dialectical model, which is then used to improve conceptual clarity in a number of difficult and intractable IR problems. The aim is not to solve these problems but to provide a clearer insight into their nature The problems explored are: how to develop a theoretical framework for IR; how to represent the meaning of documents; how to evaluate or test these different meaning representation techniques. It has proved very difficult in IR to provide a theoretical understanding of why or how meaning representation techniques work (or don’t) and also to find a way of measuring their performance which is both reliable and valid. I argue that a clearer perspective on these problems and the ways in which they are (and are not) related can be provided by the dialectical model. Thus, when they are seen as part of the dialectical relationship between the subjective and the objective, as manifested in meaning and information, their complex and often contradictory qualities appears less puzzling. Research to resolve these problems has proved problematic because it is often an attempt to synthesise the dialectical relationship between the subjective and the objective. This is both necessary, to an extent, because meaning and information require vi a connection between the subjective and the objective, but also impossible and inherently unstable, because they also require a separation. They are both mutually dependent and in conflict. The main conclusions drawn from this investigation are that philosophical approaches to IR are potentially helpful in improving theoretical clarity but that this has, so far, been hindered by a failure to acknowledge the central role of conflict and paradox in meaning and information. The dialectical model is an improvement on previous work in this area because it provides a way of working with the many contradictory concepts within IR without trying to create an untenable synthesis. This can help provide new insights into IR both on its own terms and also through the ways in which it relates to philosophy. vii Table of contents 1 Introduction................................................................................................................. 1 1.1 Motivations ......................................................................................................... 2 1.2 What this thesis is about ................................................................................... 11 1.3 Method and structure ........................................................................................ 13 1.4 Bibliographic note............................................................................................. 15 2 Meaning in philosophy ............................................................................................. 16 2.1 Meaning as reference ........................................................................................ 19 2.1.1 The nature of the reference object ............................................................ 20 2.1.2 Objectivity in meaning.............................................................................. 24 2.1.3 Subjectivity in meaning ............................................................................ 35 2.2 Meaning as a social practice ............................................................................. 42 2.2.1 Theories of meaning ................................................................................. 43 2.2.2 Language games........................................................................................ 46 2.2.3 Rules and meaning.................................................................................... 47 2.3 Conclusions....................................................................................................... 50 3 The meaning of information science......................................................................... 52 3.1 Defining information science............................................................................ 53 3.2 Information as a science.................................................................................... 62 3.2.1 What is science?........................................................................................ 63 3.2.2 Is information science a science?.............................................................. 66 3.3 What is information?......................................................................................... 77 3.3.1 Change ...................................................................................................... 79 3.3.2 Knowledge ................................................................................................ 83 3.4 Conclusions....................................................................................................... 87 4 Meaning in information science................................................................................ 89 4.1 The subjective/objective divide ........................................................................ 90 4.2 Objective approaches to information................................................................ 93 viii 4.3 Subjective approaches to information............................................................... 98 4.4 Information as a social practice ...................................................................... 104

A Dialectical Model of Information Retrieval: Exploring a Contradiction in Terms

Basic Concepts in Information Retrieval

The Seven Ages of Information Retrieval

Information Retrieval (Text Categorization)

Information Retrieval System: Concept and Scope MODULE - 5B INFORMATION RETRIEVAL SYSTEM

A Philosophical Wish List for Research in Music Information Retrieval Cynthia M

An Architecture for Information Retrieval Agents* Craig A

Semantic Information Retrieval Based on Wikipedia Taxonomy

Introduction to Information Retrieval

Linguistic Fingerprints of Internet Censorship: the Case of Sina Weibo

User Centered and Ontology Based Information Retrieval System for Life Sciences

The World Trade Law of Censorship and Internet Filtering

A Review of Ontology-Based Information Retrieval Techniques on Generic Domains