Bridging the Semantic Gap: Exploring Descriptive Vocabulary for Image Structure

BRIDGING THE SEMANTIC GAP: EXPLORING DESCRIPTIVE VOCABULARY FOR IMAGE STRUCTURE Caroline Beebe Submitted to the faculty of the University Graduate School in partial fulfi llment of the requirements for the degree Doctor of Philosophy in the School of Library and Information Science, Indiana University August 2006 Accepted by the Graduate Faculty, Indiana University, in partial fulfi llment of the requirements for the degree of Doctor of Philosophy. Doctoral Committee Elin K. Jacob, Ph.D. Alice Robbin, Ph.D. Katy Borner, Ph.D. K. Anne Pyburn, Ph.D. Date of Oral Examination August 4, 2006 ii © 2006 Caroline Beebe ALL RIGHTS RESERVED iii Acknowledgements A dissertation may be authored by one individual, but the process requires an immense amount of help and support. Many, many thanks are owed to the following people: Dr. Elin K. Jacob, my advisor and committee chair, this task could not have been accomplished without your guidance, your insight, and your painstaking work training me to be a better writer. Dr. Jacob deserves credit for coining phrases descriptive of this work: Internal contextuality, semantic coherence, semantic integrity, semantic organizers, and the shared-ness measure. Dr. Alice Robbin and Dr. Katy Borner, my SLIS committee members, you were each extremely enthusiastic, patient and supportive of my work. Dr.Anne Pyburn, my Anthropology committee member, your faith in my abilities was more than I often had. Dr. Howard Levie, my mentor in Instructional Systems Technology, who encouraged me in early theory building, and who passed on way too soon. Dick Yoakam, School of Journalism, mentored me through the qualifying paper and is sorely missed. Violet Wyld, the family cat, was my companion through the hardest and loneliest part of proposal creation, with her attempts at keyboard support, and her constant meows of agreement. Genevieve Pritchard, my daughter, cheerleader, and support for data entry, you were ever ready with a can-do attitude and late night calculating. Crystal Yoakam, my daughter, you were ever ready with the best consoling hugs; you stepped in to fi nd Christian Kelly to format this document, and with your help this will get published. Denise Schmutte, my statistics coach, who took the time to really understand my data and created the concept of a shared-ness rating. Scott Schroeder, MBA student and Excel afi cionado, took mere hours to create formulas that would have taken me days. iv Cathy Spiaggia, my friend, and Tamara Morrison, my cousin, both of whom constantly encouraged me, over all these long years ... it’s fi nally done! Bud Beebe, my Dad, and Mary BobYoakam, my mother-in-law, both would have been so proud. Jane Beebe, my Mom, whose love and support has meant so much through the grueling fi nish. Michael Yoakam, my husband, who stood by me through this seemingly endless process and to his credit did not let this exercise destroy our marriage; you have seen the worst of me and the best of me, you are the love of my life. v Preface A picture is worth a thousand words: Which ones? Describing pictures with words is like building trees with lumber. “The study of form may be descriptive merely, or it may become analytical. We begin by describing the shape of an object in the simple words of common speech: we end by describing it in the precise language of mathematics; and the one method tends to follow the other in strict scientifi c order and historical continuity ... The mathematical defi nition of form has a quality of precision quite lacking in our earlier stage of mere description ... We are brought by means of it into touch with Galileo’s aphorism (as old as Plato, as old as Pythagoras, as old perhaps as the wisdom of the Egyptians) that the book of nature is written in characters of geometry.” (Sir d’Arcy Wentworth Thompson, “On Growth and Form,” 1917, p. 719) vi Caroline Beebe BRIDGING THE SEMANTIC GAP: EXPLORING DESCRIPTIVE VOCABULARY FOR IMAGE STRUCTURE Content-Based Image Retrieval (CBIR) is a technology made possible by the binary nature of the computer. Although CBIR is used for the representation and retrieval of digital images, these systems make no attempt either to establish a basis for similarity judgments generated by query-by-pictorial-example searches or to address the connection between image content and its internal spatial composition. The disconnect between physical data (the binary code of the computer) and its conceptual interpretation (the intellectual code of the searcher) is known as the semantic gap. A descriptive vocabulary capable of representing the internal visual structure of images has the potential to bridge this gap by connecting physical data with its conceptual interpretation. The research project addressed three questions: Is there a shared vocabulary of terms used by subjects to represent the internal contextuality (i.e., composition) of images? Can the natural language terms be organized into concepts? And, if there is a vocabulary of concepts, is it shared across subject pairs? A natural language vocabulary was identifi ed on the basis of term occurrence in oral descriptions provided by 21 pairs of subjects participating in a referential communication task. In this experiment, each subject pair generated oral descriptions for 14 of 182 images drawn from the domains of abstract art, satellite imagery and photo-microscopy. Analysis of the natural language vocabulary identifi ed a set of 1,319 unique terms which were collapsed into 545 concepts. These terms and concepts were organized into a faceted vocabulary. This faceted vocabulary can contribute to the development of more effective image retrieval metrics and interfaces to minimize the terminological confusion and conceptual overlap that currently exists in most CBIR systems. For both the user and the system, the concepts in the faceted vocabulary can be used to represent shapes and relationships between shapes (i.e., internal contextuality) that constitute the internal spatial composition of an image. Representation of internal contextuality would contribute to more effective image search and retrieval by facilitating the construction of more precise feature queries by the user as well as the selection of criteria for similarity judgments in CBIR applications. vii Table of Contents Chapter 1: Introduction to the Research . .1 1.1 Introduction to the problem . .1 1.2 The problem of words and pictures . .8 1.3 Representation . .10 1.4 Defi nitions . .16 1.5 Contribution to theory . .20 Chapter 2: Literature Review . .21 2.1 Information retrieval systems . .21 2.2 Content-Based Image Retrieval (CBIR). .23 2.3 Visual language. .34 2.4 Image properties . .38 2.5 Indexing languages. .44 2.6 Typologies. .50 2.7 Image perception . .54 2.8 Form and function . .59 2.9 Relevant research . .65 2.9.1 Queries and searchers . .65 2.9.2 User studies in CBIR . .67 2.9.3 Image description . .70 2.10 Closing the semantic gap . .74 2.11 Conclusion . .78 Chapter 3: Questions and Hypothesis. .80 Chapter 4: Methodology . .84 4.1 Data collection . .84 4.1.1 Design . .84 4.1.2 Materials . .85 4.1.3 Subjects . .89 4.1.4 Procedure. .89 4.2 Data processing. .90 4.2.1 Transcribing oral descriptions . .91 4.2.2 Building the controlled vocabulary . .93 4.2.3 Word list. .95 4.2.4 Stop words . .97 4.2.5 Syntactic normalization . .100 4.2.6 Faceting and semantic normalization . .102 4.2.7 Finalizing the term list and concept hierarchy. .106 viii Chapter 5: Results. .109 5.1 Term frequencies . .110 5.1.1 Terms with high frequencies and pair count . .110 5.1.2 Image domain counts. .112 5.1.3 Lowest occurrence terms. .123 5.1.4 Parts of speech. .124 5.1.5 Comparatives and superlatives . .124 5.2 Collapsing to concepts . .126 5.2.1Concept frequencies and counts . .127 5.2.2 Image domain concept counts . .131 5.2.3 Concepts in the faceted scheme . .131 5.3 Shared concepts . .140 Chapter 6: Discussion. .155 6.1 The descriptive process. .155 6.2 Terms . .158 6.3 Concepts . .160 6.3.1 Faceted scheme . .160 6.3.2 Shared-ness: Concept frequencies and counts. .164 6.4 Limitations . .167 Chapter 7: Future Work. .169 7.1 Focus on descriptions and subject characteristics . .169 7.2 Develop data processing techniques . .170 7.3 Deeper data analysis . .171 7.4 Applications of the controlled vocabulary. .172 7.4.1 Overlap in CBIR properties . ..

Load more