Using Social Annotations to Improve Web Search
Total Page:16
File Type:pdf, Size:1020Kb
USING SOCIAL ANNOTATIONS TO IMPROVE WEB SEARCH by Worasit Choochaiwattana B.B.A. (Marketing), Chulalongkorn University, Thailand, 1996 M.Sc. (Tech of Info Sys Mgt), Mahidol University, Thailand, 2000 M.Sc. (Computer Science), Chulalongkorn University, Thailand, 2003 Submitted to the Graduate Faculty of School of Information Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh 20081 UNIVERSITY OF PITTSBURGH SCHOOL OF INFORMATION SCIENCES This dissertation was presented by Worasit Choochaiwattana It was defended on April 17, 2008 and approved by Peter Brusilovsky, PhD, Associate Professor, School of Information Sciences Daqing He, PhD, Assistance Professor, School of Information Sciences Stephen Hirtle, PhD, Professor, School of Information Sciences Brian Butler, PhD, Associate Professor, Joseph M. Katz Graduate School of Business and College of Business Administration Dissertation Advisor: Michael Spring, PhD, Associate Professor, School of Information Sciences 2 Copyright © by Worasit Choochaiwattana 2008 3 USING SOCIAL ANNOTATIONS TO IMPROVE WEB SEARCH Worasit Choochaiwattana, PhD University of Pittsburgh, 2008 Web-based tagging systems, which include social bookmarking systems such as Delicious, have become increasingly popular. These systems allow participants to annotate or tag web resources. This research examined the use of social annotations to improve the quality of web searches. The research involved three components. First, social annotations were used to index resources. Two annotation-based indexing methods were proposed: annotation based indexing and full text with annotation indexing. Second, social annotations were used to improve search result ranking. Six annotation based ranking methods were proposed: Popularity Count, Propagate Popularity Count, Query Weighted Popularity Count, Query Weighted Propagate Popularity Count, Match Tag Count and Normalized Match Tag Count. Third, social annotations were used to both index and rank resources. The result from the first experiment suggested that both static feature and similarity feature should be considered when using social annotations to re-rank search result. The result of the second experiment showed that using only annotation as an index of resources may not be a good idea. Since social Annotations could be viewed as a high level concept of the content, combining them to the content of resource could add some more important concepts to the resources. Last but not least, the result from the third experiment confirmed that the combination of using social annotations to rank the search result and using social annotations as resource index augmentation provided a promising rank of search results. It showed that social annotations could benefit web search. 4 TABLE OF CONTENTS PREFACE .................................................................................................................................... 16 1.0 INTRODUCTION ...................................................................................................... 18 1.1 DELICIOUS ....................................................................................................... 19 1.2 MOTIVATION .................................................................................................. 21 1.2.1 Using Social Annotations as Resource Indexes ........................................... 21 1.2.2 Search Result Ranking with Social Annotation .......................................... 22 1.3 RESEARCH OBJECTIVES ............................................................................. 23 1.4 ORGANIZATION OF THE PAPER ............................................................... 24 1.5 DEFINTIONS OF TERMS ............................................................................... 24 1.5.1 Annotation ...................................................................................................... 24 1.5.2 Anchor and Anchor text ............................................................................... 25 1.5.3 Link ................................................................................................................. 25 1.5.4 Metadata ......................................................................................................... 26 1.5.5 Collaboration ................................................................................................. 26 1.5.6 Folksonomy .................................................................................................... 26 1.5.7 Social Tagging/Social Annotation ................................................................ 26 1.5.8 Synonymy and Polysemy .............................................................................. 27 1.5.9 Noise ................................................................................................................ 27 5 1.5.10 Similarity Ranking and Static Ranking ..................................................... 27 1.5.11 Resource and URL ....................................................................................... 27 2.0 RELATED WORK .................................................................................................... 28 2.1 BACKGROUND ................................................................................................ 28 2.1.1 Annotation Forms and Functions ................................................................ 32 2.1.2 Purposes of Annotation ................................................................................. 34 2.1.2.1 Annotation for Memory ...................................................................... 34 2.1.2.2 Annotation for Communication ......................................................... 35 2.1.2.3 Annotation for Collaboration ............................................................ 36 2.1.2.4 Annotation for Description ................................................................ 37 2.2 DIGITAL ANNOTATION SYSTEMS ............................................................ 38 2.2.1 Systems that Used Annotations for Collaborative Purposes ..................... 39 2.2.1.1 NoteCard .............................................................................................. 39 2.2.1.2 InterNote .............................................................................................. 41 2.2.1.3 CASCADE ........................................................................................... 43 2.2.1.4 Collaborative Clinical Trial Protocol Writing System .................... 45 2.2.1.5 Col•laboració ....................................................................................... 46 2.2.2 Systems that Used Annotations for Memory and Communication Purposes ...................................................................................................................... 48 2.2.2.1 WebAnn ............................................................................................... 48 2.2.2.2 EDUCOSM .......................................................................................... 50 2.2.2.3 CritLink ............................................................................................... 51 2.2.2.4 Annotea ................................................................................................ 53 6 2.2.2.5 Microsoft Office ................................................................................... 55 2.2.2.6 Microsoft Research Annotation System ............................................ 56 2.2.3 Systems that Used Annotations for Description Purposes ......................... 57 2.2.3.1 YAWAS ................................................................................................ 57 2.2.3.2 Delicious ............................................................................................... 60 2.2.3.3 OntoAnnotate ...................................................................................... 61 2.2.3.4 Ont-O-Mat ........................................................................................... 62 2.2.3.5 Flickr .................................................................................................... 64 2.2.3.6 PhotoFinder ......................................................................................... 65 2.2.3.7 HBP Image Annotation Service ......................................................... 66 2.2.3.8 MADCOW ........................................................................................... 67 2.2.3.9 AntV ..................................................................................................... 68 2.2.3.10 VAnnotator ........................................................................................ 69 2.3 USING ANNOTATIONS TO IMPROVE SEARCH RESULTS .................. 70 3.0 PRELIMINARY ANALYSIS AND RESEARCH DESIGN .................................. 86 3.1 PROBLEM AND RESEARCH QUESTIONS ................................................ 86 3.2 PRELIMINARY ANALYSIS ........................................................................... 87 3.3 RESEARCH DESIGN ....................................................................................... 95 3.3.1 Evaluation metric .......................................................................................... 95 3.3.2 Experiment 1: Re-ranking search results .................................................... 96 3.3.2.1 Variables and Expected Results ......................................................... 96 3.3.2.2 The 1st