User-Generated Metadata in Social Software: an Analysis of Findability in Content Tagging and Recommender Systems
Presented to the Interdisciplinary Studies Program: Applied Information Management and the Graduate School of the University of Oregon in partial fulfillment of the requirement for the degree of Master of Science

CAPSTONE REPORT

User-Generated Metadata in Social Software: An Analysis of Findability in Content Tagging and Recommender Systems

Michael Barnes
Manager, User Experience Design
VeriSign, Inc.

University of Oregon
Applied Information Management Program
722 SW Second Avenue, Suite 230
Portland, OR 97204
(800) 824-2714

March, 2007

Approved by __________________________________________
Dr. Linda F. Ettinger
Academic Director, AIM Program

Abstract for
User-Generated Metadata in Social Software: An Analysis of Findability in Content Tagging and Recommender Systems

This study describes how user-generated metadata may be leveraged to enhance findability in web-based social software applications (Morville, 2005). Two interaction design systems, content tagging (Golder & Huberman, 2005) and recommender systems (Resnick & Varian, 1997), are examined to identify strengths and weaknesses along three findability factors: information classification, information retrieval and information discovery. Greater overall findability strength may be found in content tagging systems than in recommender systems.

TABLE OF CONTENTS

FIGURES AND TABLES
CHAPTER I. PURPOSE OF THE STUDY
    BRIEF PURPOSE
    FULL PURPOSE
    SIGNIFICANCE OF THE STUDY
    LIMITATIONS TO THE RESEARCH
    PROBLEM AREA
CHAPTER II. REVIEW OF REFERENCES
CHAPTER III. METHOD
CHAPTER IV. ANALYSIS OF DATA
CHAPTER V. CONCLUSIONS
APPENDIX A: DEFINITION OF TERMS
APPENDIX B: SOURCES SELECTED FOR CONTENT ANALYSIS
APPENDIX C: KEY PASSAGES REFERENCED
REFERENCES

FIGURES AND TABLES

Figures
Figure 1 – Key words used to guide the literature search process
Figure 2 – Sample system-factor table for the Analysis of Data Section
Figure 3 – Template for outcome presentation
Figure 4 – System-factor codes and combinations for content analysis coding

Tables
Table 1 – Distribution of Systems and Factors found in content analysis sources
Table 2 – Number of sources for each system-factor combination
Table 3 – Strength and weakness elements identified for CT-IC
Table 4 – Strength and weakness elements identified for CT-IR
Table 5 – Strength and weakness elements identified for CT-ID
Table 6 – Strength and weakness elements identified for RS-IC
Table 7 – Strength and weakness elements identified for RS-IR
Table 8 – Strength and weakness elements identified for RS-ID
Table 9 – Strength and weakness balances for each system-factor combination
Table 10 – Key Outcome: Strengths, Weaknesses, and Examples

CHAPTER I. PURPOSE OF THE STUDY

BRIEF PURPOSE

The purpose of this study is to describe how explicitly collected and user-defined metadata may be used to enhance findability (Morville, 2005) in content tagging (Golder & Huberman, 2005) and recommender systems (Resnick & Varian, 1997) used in web-based social software applications (Morville & Rosenfeld, 2002). Findability is defined as the ability to locate desired information in web-based information systems through information classification, information retrieval, and information discovery methods (Morville, 2005). Content tagging and recommender systems are selected as the systems for evaluating the value of user-generated metadata due to their increasingly common role in social software design (Shneiderman, 2000).

Metadata is data that describes data (Tannenbaum, 2001; Stephens, 2003). It defines specific characteristics of information-bearing entities to enhance their identification, discovery, assessment, and management (Durrell, 1985).
For purposes of this study, user-generated metadata is defined as information that users are prompted to assign to a particular digital artifact, such as a document, product, media file, or other content, for the purpose of describing, categorizing, or rating it for future retrieval (Mathes, 2004). Through web-based applications designed to capture user descriptions of content, the descriptive data may be used as metadata and manipulated to enhance the usefulness of the information system (Kimball, 1998). When this personal metadata is shared and exposed to other users of an information system as a design strategy to aid information retrieval and discovery (Morville & Rosenfeld, 2002), the value of the metadata evolves to provide both a personal and a social benefit (Mathes, 2004).

Social software supports, extends, and derives its value from social behavior (Coates, 2005). The active participation of users in the process of information creation, classification, and discovery (Mathes, 2004) is a core characteristic of social software and social computing (Millen & Patterson, 2002). As such, the creation of user-generated metadata may be considered a collaborative and social activity when it is designed to support findability (Morville, 2005) for all users of an information system. This is the goal of most content tagging and recommender systems (Bielenberg & Zacher, 2005).

In the search and content management fields, there is growing interest in web-based content tagging tools that enable users to assign their own descriptive metadata to content so that information may be intuitively cataloged and more easily retrieved (Mathes, 2004). Similarly, recommender systems rely on a combination of user preferences and the collective ratings of items to produce relevant information (Resnick & Varian, 1997).

The study is designed as a literature review and is exploratory, interpretive, descriptive, and qualitative (Leedy & Ormrod, 2005).
Literature collection is limited to existing literature screened as part of a conceptual content analysis approach. Once sub-topics and delimitations are clearly defined and a research map is finalized, a systematic search for relevant and meaningful literature related to predefined topics is conducted. A conceptual content analysis method (Palmquist et al., 2006) is employed to identify strength and weakness elements related to three findability factors (information classification, retrieval, and discovery) as found in content tagging and recommender systems. This approach is appropriate to meet the goals and purpose of this study because it enables the collection and review of content related to predefined topical areas.

Results of the content analysis process are presented in a series of six tables that display the strengths and weaknesses of the three search-related factors (information classification, retrieval, and discovery) for each of the two social software systems selected for the study (content tagging and recommender systems). The strength and weakness elements of each factor are determined by their ability to enhance findability in content tagging and recommender systems.

The primary outcome of this study is a reference resource presented as a comparative tool (see Table 10: Strengths, Weaknesses, and Examples of Content Tagging and Recommender Systems). The tool is produced for user experience practitioners who design or are considering designing web-based social software applications that leverage user-generated metadata to enhance findability. Specific findability strengths, weaknesses, and application examples of content tagging and recommender systems are identified. This list will help user experience professionals assess the reported findability value of each system in classifying, retrieving, and discovering information.
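
The layout of the six strength and weakness tables described above (two systems by three findability factors) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the coding procedure used in the study: the codes are assumed to combine the system and factor abbreviations that appear in the table titles (CT, RS, IC, IR, ID), the function names are hypothetical, and the sample passages are invented placeholders rather than data from the content analysis.

```python
from collections import Counter
from itertools import product

# Abbreviations as they appear in the table titles; the expansions are the
# two systems and three findability factors named in the study.
SYSTEMS = {"CT": "content tagging", "RS": "recommender systems"}
FACTORS = {
    "IC": "information classification",
    "IR": "information retrieval",
    "ID": "information discovery",
}


def system_factor_codes():
    """Return the six system-factor codes (hypothetical helper)."""
    return [f"{system}-{factor}" for system, factor in product(SYSTEMS, FACTORS)]


def tally(coded_passages):
    """Count strength and weakness elements for each system-factor code.

    coded_passages is an iterable of (code, kind) pairs, where kind is
    either "strength" or "weakness".
    """
    counts = {code: Counter() for code in system_factor_codes()}
    for code, kind in coded_passages:
        if code not in counts:
            raise ValueError(f"unknown system-factor code: {code}")
        if kind not in ("strength", "weakness"):
            raise ValueError(f"unknown element kind: {kind}")
        counts[code][kind] += 1
    return counts


# Invented placeholder passages, used only to exercise the functions above.
sample = [
    ("CT-IC", "strength"),
    ("CT-IC", "weakness"),
    ("RS-ID", "strength"),
]

for code, counter in tally(sample).items():
    print(code, counter["strength"], counter["weakness"])
```

Keeping one counter per code mirrors the one-table-per-combination presentation, so a combination with no coded passages still appears with zero counts instead of being silently omitted.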
FULL PURPOSE

There is value in information (Baily, 1997), but finding valuable information online is difficult due to the growing volume of data, diversity of file formats, and constantly changing nature of the medium (Morville, 2005). In an information-driven economy, contextually relevant information becomes an asset and form of currency (Bailey,