Expert Assisted Exploration of Photographs Supporting Users in Exploring Visual Media Through Subjective Aesthetic Attributes and Crowd-Sourced Tags
Total Page:16
File Type:pdf, Size:1020Kb
Expert assisted exploration of photographs Supporting Users in Exploring Visual Media through Subjective Aesthetic Attributes and Crowd-sourced Tags A dissertation submitted to the University of Dublin, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science Meltem G¨urel 2009 Copyright by Meltem G¨urel 2009 Declaration I, the undersigned, declare that this work has not previously been submitted to this or any other University, and that unless otherwise stated, it is entirely my own work. Meltem G¨urel Dated: September 11, 2009 Permission to Lend and/or Copy I, the undersigned, agree that Trinity College Library may lend or copy this thesis upon request. Meltem G¨urel Dated: September 11, 2009 Acknowledgements I want to express my genuine gratitude to my supervisor, Dr. Owen Conlan, for all the time spent assisting me and for all the encouragement and guidance offered. I would also like to thank Cormac Hampson for all his work and useful feedback, and Dr. Vincent Wade for the valuable suggestions. To my family and friends, and especially my fellow NDS classmates, I would like to thank you for being there even at the last minute. Meltem G¨urel University of Dublin, Trinity College September 2009 iv Abstract While digital technology extinguished the obligatory use of the photographic film as well as the time consuming chemical photograph development techniques, the num- ber of taken photographs has rapidly increased. As the computer harddisks as well as the photography websites substituted the photo albums, finding photographs from digital archives became difficult. Conventional image retrieval systems that seek to bridge the semantic gap, optimize photograph discovery based on associated tags or their low-level features, which can define the information regarding the content of a photograph, however not the photographs' aesthetic value. This dissertation investigates the possible benefits of augmenting traditional query techniques with subjective expert knowledge based on the manipulation of raw low-level data in order to empower users in exploring visual media. A novel tool, X2P hoto is presented which aims to enable users in retrieving photographs from large collections using not only objective tags but also subjective expertise based on photographs' color space. Evaluation results showed that conventional tag-based systems ignore the ap- preciative expressions hence limit the users to search via tagged simplifications of a photograph rather than the aesthetics of the photograph itself. Injecting expert knowledge into a conventional system that only offers tag-based searching, allows users to freely express both the aesthetics of the photograph they want as well as the picture it conveys. X2P hoto provided users an alternative pathway to access their large photograph collections. Initial user tests showed promise and indicate that this approach can be used to grant users the freedom they seek in relation to photograph discovery. v Contents Acknowledgements iv Abstract v List of Figures x Chapter 1 Introduction 1 1.1 Motivation . 1 1.2 Research Question . 4 1.3 Objectives . 5 1.4 Approach . 5 1.5 Thesis Outline . 6 Chapter 2 State of the Art 7 2.1 Introduction . 7 2.2 Attributing semantics to photographs . 7 2.2.1 Content Analysis . 8 2.2.2 Context Analysis - Exchange Information . 9 2.2.3 Context Analysis - Human Input . 9 2.3 Keyword-based semantic search . 10 2.3.1 Image Search Engines . 10 2.3.2 Flickr: A photograph-sharing site . 15 2.4 View-based semantic search . 18 2.5 Expert assisted knowledge discovery . 19 2.6 Color in photography . 20 vi 2.7 Analysis . 22 Chapter 3 Design 24 3.1 Introduction . 24 3.2 Requirements . 24 3.2.1 Analysis . 24 3.2.2 Manipulation of Tags . 25 3.2.3 Color Theory Integration . 25 3.2.4 Expert Vocabulary . 27 3.2.5 User Interface . 28 3.3 Architecture . 31 3.3.1 Presentation Layer . 31 3.3.2 Business Logic Layer . 32 3.3.3 Storage Layer . 33 3.3.4 System Flow . 34 3.4 UI for Exploration . 34 3.4.1 Tag Space . 35 3.4.2 Expert Knowledge . 35 3.4.3 Discovery Space . 36 3.4.4 Use Cases . 38 3.5 Summary . 39 Chapter 4 Implementation 41 4.1 Introduction . 41 4.2 Data Collection . 41 4.2.1 Photograph Repository . 42 4.2.2 Metadata Repository . 44 4.2.3 Processed Content Repository . 47 4.2.4 Domain Model . 50 4.3 Prototype . 50 4.3.1 The Semantic Attributes . 50 4.4 GUI Prototypes . 52 vii 4.4.1 TagBall . 52 4.4.2 AttBar . 53 4.4.3 Discovery Space . 53 4.5 Final System . 56 4.5.1 Technologies Used . 56 4.5.2 Architecture . 58 4.5.3 A demonstration of X2P hoto . 59 4.5.4 Analysis . 61 4.6 Summary . 62 Chapter 5 Evaluation 63 5.1 Introduction . 63 5.2 User Study . 63 5.2.1 Evaluation Setup . 63 5.2.2 User Group . 65 5.3 User Tests . 66 5.4 User Survey . 74 5.5 Analysis . 79 5.5.1 Overall Evaluation Results . 79 5.5.2 System-specific Requirements . 80 5.6 Summary . 81 Chapter 6 Conclusion 82 6.1 Summary . 82 6.2 Use Case . 83 6.3 Future Work . 84 Appendices 84 Appendix A Evaluation Questionnaire 85 Appendix B Domain Model 88 viii Bibliography 90 ix List of Figures 2.1 A digital photograph . 8 3.1 The HSL color space can be represented by a double cone showing the three axes of hue, saturation and luminosity. [Canon, 2009] . 26 3.2 WP-Cumulus' tag sphere . 29 3.3 Cooliris screen shot . 30 3.4 High Level Architecture . 31 3.5 Use Case Scenarios . 38 4.1 The original digital photograph . 48 4.2 Divided into regions . 48 4.3 Colors after processing . 49 4.4 PHP-based image search . 51 4.5 A conical spiral presentation of the photographs . 54 4.6 A cylindrical presentation of the photographs . 54 4.7 The interaction with individual photographs . 55 4.8 Final System Architecture . 58 4.9 Exploration . 59 4.10 Zoomed in View . 59 4.11 Viewing Details . 60 4.12 In Focus . 60 4.13 Favorites Area . 61 5.1 Photographs shown to users . 64 5.2 Similar photographs found using only the AttBar by a disagreeing user 67 x 5.3 Similar photographs found using only the AttBar by a disagreeing user 68 5.4 Similar photographs found using only the AttBar by agreeing users . 68 5.5 Similar photographs found using only the TagBall . 69 5.6 Similar photographs found using all features . 71 5.7 Similar photograph found in Flickr by a user . 73 5.8 Similar photograph found in Flickr by a user . 74 5.9 A photograph found in Flick with the tags \city, night, backstreet, USA" . 75 5.10 Similar photograph found in Flickr by a user . 75 xi Listings 4.1 Example Panda Call Response . 43 4.2 Example getInfo Call Response . 44 4.3 Example InformationDB XML . 45 4.4 Example ExifDB XML . 46 4.5 Example TagRepository DB . 47 4.6 Example ContentRepository DB . 49 5.1 Flickr response returning the tags related with sky . 70 B.1 Simplified Domain Model . 88 xii Chapter 1.