Multimedia Information Retrieval Copyright © 2010 by Morgan & Claypool
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

Multimedia Information Retrieval
Stefan Rüger
www.morganclaypool.com

ISBN: 9781608450978 paperback
ISBN: 9781608450985 ebook
DOI 10.2200/S00244ED1V01Y200912ICR010

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON INFORMATION CONCEPTS, RETRIEVAL, AND SERVICES
Lecture #10
Series Editor: Gary Marchionini, University of North Carolina, Chapel Hill
Series ISSN
Synthesis Lectures on Information Concepts, Retrieval, and Services
Print 1947-945X Electronic 1947-9468

Synthesis Lectures on Information Concepts, Retrieval, and Services
Editor
Gary Marchionini, University of North Carolina, Chapel Hill

Multimedia Information Retrieval
Stefan Rüger, 2009

Information Architecture: The Design and Integration of Information Spaces
Wei Ding, Xia Lin, 2009

Reading and Writing the Electronic Book
Catherine C. Marshall, 2009

Hypermedia Genes: An Evolutionary Perspective on Concepts, Models, and Architectures
Nuno M. Guimarães, Luís M. Carriço, 2009

Understanding User-Web Interactions via Web Analytics
Bernard J. (Jim) Jansen, 2009

XML Retrieval
Mounia Lalmas, 2009

Faceted Search
Daniel Tunkelang, 2009

Introduction to Webometrics: Quantitative Web Research for the Social Sciences
Michael Thelwall, 2009

Exploratory Search: Beyond the Query-Response Paradigm
Ryen W. White, Resa A. Roth, 2009

New Concepts in Digital Reference
R. David Lankes, 2009

Automated Metadata in Multimedia Information Systems: Creation, Refinement, Use in Surrogates, and Evaluation
Michael G. Christel, 2009

Multimedia Information Retrieval
Stefan Rüger
The Open University
SYNTHESIS LECTURES ON INFORMATION CONCEPTS, RETRIEVAL, AND SERVICES #10
Morgan & Claypool Publishers

ABSTRACT

At its very core, multimedia information retrieval means the process of searching for and finding multimedia documents; the corresponding research field is concerned with building the best possible multimedia search engines. The intriguing bit here is that the query itself can be a multimedia excerpt: for example, when you walk around in an unknown place and stumble across an interesting landmark, would it not be great if you could just take a picture with your mobile phone and send it to a service that finds a similar picture in a database and tells you more about the building, and about its significance for that matter?

This book goes further by examining the full matrix of a variety of query modes versus document types. How do you retrieve a music piece by humming? What if you want to find news video clips on forest fires using a still image? The text discusses underlying techniques and common approaches that facilitate multimedia search engines: metadata driven retrieval; piggy-back text retrieval, where automated processes create text surrogates for multimedia; automated image annotation; and content-based retrieval. The latter is studied in great depth, looking at features and distances and at how to combine them effectively for efficient retrieval, to a point where the readers have the ingredients and recipe in their hands for building their own multimedia search engines.

Supporting users in their resource discovery mission when hunting for multimedia material is not a technological indexing problem alone. We look at interactive ways of engaging with repositories through browsing and relevance feedback, roping in geographical context, and providing visual summaries for videos.
The book concludes with an overview of state-of-the-art research projects in the area of multimedia information retrieval, which gives an indication of the research and development trends and, thereby, a glimpse of the future world.

KEYWORDS

multimedia information retrieval, multimedia digital libraries, visual search, content-based retrieval, piggy-back text retrieval, automated image annotation, audiovisual fingerprinting, semantic gap, polysemy, multimedia features and distances, fusion of features and distances, high-dimensional indexing, video summaries, information visualisation, relevance feedback, geo-temporal browsing

Contents

Preface

1 What is Multimedia Information Retrieval?
  1.1 Information Retrieval
  1.2 Multimedia
  1.3 Multimedia Information Retrieval
  1.4 Challenges of Automated Multimedia Indexing
  1.5 Summary
  1.6 Exercises
    1.6.1 Memex
    1.6.2 Loops and Interaction
    1.6.3 Automated vs Manual
    1.6.4 Compound Text Queries
    1.6.5 Search Types

2 Basic Multimedia Search Technologies
  2.1 Metadata Driven Retrieval
  2.2 Piggy-back Text Retrieval
  2.3 Content-based Retrieval
  2.4 Automated Image Annotation
  2.5 Fingerprinting
    2.5.1 Audio Fingerprinting
    2.5.2 Image Fingerprinting
  2.6 Exercises
    2.6.1 Search Types Continued
    2.6.2 Intensity Histograms
    2.6.3 Fingerprint Block Probabilities
    2.6.4 Fingerprint Block False Positives
    2.6.5 Shazam’s Constellation Pairs
    2.6.6 One Pass Algorithm for Min Hash

3 Content-based Retrieval in Depth
  3.1 Content-based Retrieval Architecture
  3.2 Features
    3.2.1 Colour Histograms
    3.2.2 Statistical Moments
    3.2.3 Texture Histograms
    3.2.4 Shape
    3.2.5 Spatial Information
    3.2.6 Other Feature Types
  3.3 Distances
    3.3.1 Geometric Component-wise Distances
    3.3.2 Geometric Quadratic Distances
    3.3.3 Statistical Distances
    3.3.4 Probabilistic Distance Measures
    3.3.5 Ordinal and Nominal Distances
    3.3.6 String-based Distances
  3.4 Feature and Distance Standardisation
    3.4.1 Component-wise Standardisation using Corpus Statistics
    3.4.2 Range Standardisation
    3.4.3 Ratio Features
    3.4.4 Vector Normalisation
  3.5 High-dimensional Indexing
  3.6 Fusion of Feature Spaces and Query Results
    3.6.1 Single Query Example with Multiple Features
    3.6.2 Multiple Query Examples
    3.6.3 Order of Fusion
  3.7 Exercises
    3.7.1 Colour Histograms
    3.7.2 HSV Colour Space Quantisation
    3.7.3 CIE LUV Colour Space Quantisation
    3.7.4 Skewness and Kurtosis
    3.7.5 Boundaries for Tamura Features
    3.7.6 Distances and Dissimilarities
    3.7.7 Ordinal Distances: Pen-pal Matching
    3.7.8 Asymmetric Binary Features
    3.7.9 Jaccard Distance
    3.7.10 Levenshtein Distance
    3.7.11 Co-occurrence Dissimilarity
    3.7.12 Chain Codes and Edit Distance
    3.7.13 Time Warping Distance
    3.7.14 Feature Standardisation
    3.7.15 Curse of Dimensionality
    3.7.16 Image Search

4 Added Services
  4.1 Video Summaries
  4.2 Paradigms in Information Visualisation
  4.3 Visual Search and Relevance Feedback
  4.4 Browsing
  4.5 Geo-temporal Aspects of Media
    4.5.1 Importance of Geography as Context
    4.5.2 Geo-temporal Browsing and Access
  4.6 Exercises
    4.6.1 Interface Design and Functionality
    4.6.2 Shot Boundary Detection for Gradual Transitions
    4.6.3 Relevance Feedback: Optimal Weight Computation
    4.6.4 Relevance Feedback: Lagrangian Multiplier
    4.6.5 Geographic Attributes
    4.6.6 View of the World: Retrieval and Context

5 Multimedia Information Retrieval Research
  5.1 Multimedia Representation and Management
    5.1.1 GATE
    5.1.2 AXMEDIS
    5.1.3 NM2
    5.1.4 3DTV
    5.1.5 VISNET II
    5.1.6 X-Media
    5.1.7 MediaCampaign
    5.1.8 SALERO
    5.1.9 LUISA
    5.1.10 SEMEDIA
  5.2 Digital Libraries
    5.2.1 Greenstone
    5.2.2 Informedia
    5.2.3 SCULPTEUR
    5.2.4 CHLT
    5.2.5 DELOS
    5.2.6 BRICKS
    5.2.7 StoryBank
  5.3 Metadata and Automated Annotation
    5.3.1 aceMedia
    5.3.2 LAVA
    5.3.3 KerMIT
    5.3.4 POLYMNIA