Affect-Based Indexing and Retrieval of Multimedia Data Ching Hau Chan
Total Page:16
File Type:pdf, Size:1020Kb
Affect-Based Indexing and Retrieval of Multimedia Data Ching Hau Chan A dissertation submitted in fulfillment of the requirements for the award of Doctor of Philosophy (Ph.D.) Supervisor: Dr. Gareth J.F. Jones School of Computing October 2006 Declaration I hereby certify that this material, which I now submit for assessment in the programme of study leading to the award of Doctor of Philosophy, is entirely my own work and has not been taken from the work of others save and to the extent that such work has been cited and acknowledged within the text of my own work. Signed : _______ ID No. : Date : 1 Title: Affect-based Indexing and Retrieval System of Multimedia Data Abstract Digital multimedia systems are creating many new opportunities for rapid access to content archives. In order to explore these collections using search, the content must be annotated with significant features. An important and often overlooked aspect of human interpretation of multimedia data is the affective dimension. The hypothesis of this thesis is that affective labels of content can be extracted automatically from within multimedia data streams, and that these can then be used for content-based retrieval and browsing. A novel system is presented for extracting affective features from video content and mapping it onto a set of keywords with predetermined emotional interpretations. These labels are then used to demonstrate affect-based retrieval on a range of feature films. Because of the subjective nature of the words people use to describe emotions, an approach towards an open vocabulary query system utilizing the electronic lexical database WordNet is also presented. This gives flexibility for search queries to be extended to include keywords without predetermined emotional interpretations using a word-similarity measure. The thesis presents the framework and design for the affect- based indexing and retrieval system along with experiments, analysis, and conclusions. Acknowledgements Firstly I would like to thank my supervisor, Dr. Gareth J.F. Jones. I could not have imagined having a better supervisor and mentor for my PhD, and without his kindness, knowledge, scrutiny, intuition, guidance and care over the years I would never have finished. I would like to thank my examiners, Dr. Noel E. O’Connor (internal) and Dr. Colin Johnson (external) for taking the time to read and examine the thesis as well as providing helpful feedback. Thanks to all the people in the Centre of Digital Video Processing at DCU especially Prof. Alan F. Smeaton, Dr. Hyowon Lee, and Dr. Bart Lehane for helpful guidance and advice over the years. Thanks to all the people who committed a significant amount of time to volunteer for the user experiments. I would also like to thank all the rest of the academic and support staff of the School of Computing and Dublin City University. On a more personal note, I would like to thank my parents for their patience and active support and encouragement throughout the years. Thanks to Mabel as well for offering her support and help as well. Finally, thanks to all my friends and family for helpful advice and words of encouragement. Table of Contents Page Number Declaration............................................................................................................................................. ii A b stract........................................................................................................................................................iii Acknowledgements..................................................................................................................................iv L ist o f F igures.........................................................................................................................................viii L ist o f T ab les..............................................................................................................................................xi C h a p te r 1 I n t r o d u c t io n ................................................................................................................... 1 1.1 Background and Motivation of This Thesis....................................................................... 1 1.1.1 Explosion o f Digital Video D a ta ...............................................................................1 1.1.2 Digital Video Delivery Through the Internet....................................................... 2 1.1.3 Emergence o f Video On-Demand D eliv ery .......................................................... 3 1.2 Significance o f Research Work in the T h esis...................................................................... 4 1.2.1 Information Overload................................................................................................. 4 1.2.2 Content-Based V ideo Retrieval on the A ffective L e v e l....................................5 1.2.3 Ambiguity and Incompleteness of Video Annotation and Search.................7 1.2.4 Video Indexing with Emotion Labels...................................................................... 8 1.3 Contributions............ ......................................................................................................................9 1.4 Thesis O verview .......................................................................................................................... 11 C h a p te r 2 D ig it a l V i d e o ...............................................................................................................12 2.1 Digital V ideo Concepts..............................................................................................................12 2.1.1 Introduction...................................................................................................................1 2 2.1.2 Digital and Analog V id eo.......................................................................................... 13 2.1.3 Digital Video Navigation and Brow sing...............................................................16 2.1.4 Colour Space Representation in Digital Video................................................... 17 2.2 Digital Audio Concepts.............................................................................................................23 2.2.1 Introduction.................................................................................................................... 23 2.2.2 Resolution and Channel o f Digital Audio............................................................24 2.2.3 Pulse-Code M odulation (PCM )...............................................................................24 2.2.4 Sampling Rate and Nyquist-Shannon Sampling Theorem..............................25 2.2.5 Representation of Digital Audio........................ ....................................................25 2.2.6 Fast Fourier Transform.............................................................................................. 27 2.2.7 Audio formats................................................................................................................28 2.3 A Review o f MPEG Compression..................................................................................... 29 2.4 MPEG-1 V id eo.............................................................................................................................30 2.4.1 Introduction.................................................................................................................... 30 2.4.2 I, P, and B Frames......................................................................................................32 2.4.3 M acroblock.....................................................................................................................33 2.4.4 Variable Length Coding.............................................................................................35 2.4.5 M otion Estimation and Compensation..................................................................37 v 2.5 MPEG-1 A udio............................................................................................................................39 2.5.1 Introduction....................................................................................................................39 2.5.2 Layer I, II and m .........................................................................................................39 2.5.3 Auditory M asking........................................................................................................40 2.5.4 MPEG Audio Encoder and Decoder......................................................................40 2.6 Summary........................................................................................................................................42 Chapter 3 Content Based Video Retrieval.............................................. 43 3.1 Introduction...................................................................................................................................43 3.2 Challenges for Content-Based Video Retrieval Systems.............................................. 44 3.3 Bridging the Semantic Gap with A ffective Computing.................................................46 3.4 The Three Levels o f Content Retrieval.............................................................................. 47 3.5 Examples o f Video Retrieval System s................................................................................48