The Subject Indexing Process: an Investigation of Problems in Knowledge Representation

The Subject Indexing Process: an investigation of problems in knowledge representation Jens-Erik Mai Graduate School of Library and Information Science The University of Texas at Austin May 2000 The Subject Indexing Process: an investigation of problems in knowledge representation by Jens-Erik Mai, M.L.I.S. Dissertation Presented to the Faculty of the Graduate School of Library and Information Science the University of Texas at Austin in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy The University of Texas at Austin May 2000 ii The Subject Indexing Process: an investigation of problems in knowledge representation Approved by Dissertation Committee: Francis L. Miksa, Supervisor Philip Doty E. Glynn Harmon Aloysius P. Martinich Ruth A. Palmquist iii Dedicated to Mette without whom this would have been impossible. iv Acknowledgements I wish to thank a number of people and organizations that have been involved in the completion of this dissertation. First and foremost I wish to express my sincere appreciation to Fran Miksa, my supervisor, for his support, encouragement, and belief in this dissertation. My committee has supported and given me excellent feedback during this journey. My thanks therefore go to Phil Doty for sharing his extraordinary breath of view, Glynn Harmon for always bringing a larger perspective to things, Al Martinich for making me understand the basics of philosophy of language, and Ruth Palmquist for always asking the critical questions. Special thanks also go to my friends and colleges at the Royal School of Library and Information Science, especially to Hanne Albrechtsen, Pia Borlund, Birger Hjørland, and Peter Ingwersen. My good friend Ole Bech-Petersen has improved the readability of the dissertation greatly, for that he deserves a huge thanks. A number of organizations, schools, and foundations have supported me economically during my doctoral studies. I am grateful to all of them: the v University of Texas at Austin, Graduate School of Library and Information Science; the Royal School of Library and Information Science; the Fulbright Commission; The Danish Research Academy, Aarhus; The Denmark-America Foundation, Copenhagen; Knud Højgaards Fond, Charlottenlund; and Det Obelske Familiefond, Aalborg. vi The Subject Indexing Process: an investigation of problems in knowledge representation Publication No._____________ Jens-Erik Mai, Ph.D. The University of Texas at Austin, 2000 Supervisor: Francis Miksa The present dissertation is an attempt to explain at least some of the major problems related to subject indexing, that is, the subject representation of documents. In the process of doing so a new approach to the subject indexing process is proposed. This dissertation enters the discussion of how the subject matter of a document is determined from the standpoint that subjects are culturally and socially dependent. Common approaches to the study of indexing have been to study how indexers index or to prescribe how indexers should index, with the goal of describing the rules of indexing or prescribing a universal and objective way which indexing should be carried out. Here it is argued that it is not possible to vii prescribe how indexing should be done and that there are no universal and correct ways of indexing which can be applied to any given situation. Studies of how indexing is done must be based on an understanding of how one arrives at the meaning of a given set of human communicative symbols. The subject indexing process is ordinarily described as a process that takes a number of steps. Here, however, it is argued that this typical approach characteristically lacks an understanding of what the central nature of the process is. Indexing is not a neutral and objective representation of a document’s subject matter but the representation of an interpretation of a document for future use. Semiotics, based on key ideas found in Charles Sanders Peirce’s writings, is offered here as a framework for understanding the “interpretive” nature of the subject indexing process. Peirce’s semiotics offers an especially valuable terminology for describing the different kinds of interpretation that takes place throughout the subject indexing process. By placing the subject indexing process within Peirce’s semiotic framework of ideas and terminology, a more detailed description of the process is offered here than has been previously possible. While this attempt to show what occurs in the subject indexing process offers a more realistic view of what actually happens in the process, it also suggests strongly that there is no certainty to the result of the subject indexing process. viii Table of Contents List of Figures................................................................................................... xi 1. Introduction................................................................................................... 1 1.1. To See and to Interpret...................................................................... 9 1.2. Structure of the Dissertation ........................................................... 15 2. The Subject Indexing Process..................................................................... 18 2.1. Determining the Subject of a Document ........................................ 20 2.2. Document and Subject Analysis..................................................... 49 2.3. Stages of Indexer Experience ......................................................... 66 2.4. An Interpretive Approach ............................................................... 73 2.5. Summary and Conclusion............................................................... 92 3. Representation ............................................................................................ 94 3.1. Determining the Subject Matter...................................................... 97 3.2. Studies on Indexing ...................................................................... 120 3.3. Relevance Studies......................................................................... 171 3.4. Summary and Conclusion............................................................. 181 4. Semiotics .................................................................................................. 197 4.1. Definition of Semiotics................................................................. 200 4.2. C. S. Peirce: A Biographical Sketch............................................. 202 4.3. The Sign........................................................................................ 204 4.4. Unlimited Semiosis....................................................................... 211 ix 4.5. Phenomenology ............................................................................ 214 4.6. Categories of Signs....................................................................... 219 4.7. Summary and Conclusion............................................................. 249 5. Semiotic Analysis of the Subject Indexing Process.................................. 251 5.1. The Subject Indexing Process as Unlimited Semiosis.................. 253 5.2. The Steps of the Subject Indexing Process................................... 260 5.3. Peirce’s Categories of Signs and the Four Elements .................... 275 5.4. Summary and Conclusion............................................................. 300 6. Summary and Conclusions ....................................................................... 303 6.1. Approaches to Indexing................................................................ 306 6.2. Indexing as Interpretation ............................................................. 309 6.3. Semiotics and Indexing................................................................. 312 6.4. Later Study of Indexing ................................................................ 315 6.5. Summary....................................................................................... 318 Bibliography .................................................................................................. 319 Vita .............................................................................................................. 345 x List of Figures Fig. 1-1. The Necker Cube.............................................................................. 10 Fig. 1-2. The Jastrow Duck-Rabbit................................................................. 11 Fig. 2-1. The Scope-Matching Process........................................................... 56 Fig. 2-2. The Subject Indexing Process .......................................................... 62 Fig. 3-1. Conceptions of Subject Analysis and Indexing.............................. 130 Fig. 3-2. Variables in Relevance Judgments................................................. 172 Fig. 3-3. Definitions of Relevance................................................................ 173 Fig. 3-4. Aspects of the Five Conceptions of Indexing ................................ 189 Fig. 3-5. Five Conceptions of Indexing ........................................................ 190 Fig. 4-1. The Semiotic Triangle.................................................................... 205 Fig. 4-2. The Y-leg Model............................................................................ 208 Fig. 4-3. Unlimited Semiosis........................................................................ 212 Fig. 4-4. Triadic Classification of Signs......................................................

The Subject Indexing Process: an Investigation of Problems in Knowledge Representation

Glossary and Bibliography for Vocabularies 1 the Codes (For Example, the Dewey Decimal System Number 735.942)

Catalogue and Index

The Advantages and Disadvantages of Social Tagging: Evaluation of Delicious Website1

Taxonomy Directed Folksonomies

Unit 11 Indexing Techniques

Guidelines for Subject Access in National Bibliographies Draft 2011, May

Controlled Vocabularies and Tags: an Analysis of Research Methods

Chapter K. Controlled Vocabulary (CV)

The Exploitation of Social Tagging in Libraries

The Role of Social Tags in Web Resource Discovery: an Evaluation of User-Generated Keywords

A Utilization Model of Users' Metadata in Libraries

How Semantic Tagging Increases Findability