A Power Tool for Interactive Content-Based Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND VIDEO TECHNOLOGY

Relevance Feedback: A Power Tool for Interactive Content-Based Image Retrieval

Yong Rui, Thomas S. Huang, Michael Ortega, and Sharad Mehrotra
Beckman Institute for Advanced Science and Technology, Dept. of Electrical and Computer Engineering, and Dept. of Computer Science
University of Illinois at Urbana-Champaign, Urbana, IL, USA
E-mail: {yrui, huang}@ifp.uiuc.edu, {ortegab, sharad}@cs.uiuc.edu

Abstract

Content-Based Image Retrieval (CBIR) has become one of the most active research areas in the past few years. Many visual feature representations have been explored and many systems built. While these research efforts establish the basis of CBIR, the usefulness of the proposed approaches is limited. Specifically, these efforts have relatively ignored two distinct characteristics of CBIR systems: (1) the gap between high level concepts and low level features; (2) the subjectivity of human perception of visual content. This paper proposes a relevance feedback based interactive retrieval approach, which effectively takes into account the above two characteristics in CBIR. During the retrieval process, the user's high level query and perception subjectivity are captured by dynamically updated weights based on the user's feedback. Experimental results on a large image collection show that the proposed approach greatly reduces the user's effort of composing a query and captures the user's information need more precisely.

Keywords: Content-Based Image Retrieval, interactive multimedia processing, relevance feedback.

This work was supported in part by the NSF/DARPA/NASA DLI Program under a Cooperative Agreement, in part by ARL Cooperative Agreement No. DAAL…, and in part by NSF CISE Research Infrastructure Grant CDA…. Yong Rui was also supported in part by a CSE Fellowship, the University of Illinois. Michael Ortega was also supported in part by a CONACYT grant.

I. Introduction

With advances in computer technologies and the advent of the World-Wide Web, there has been an explosion in the amount and complexity of digital data being generated, stored, transmitted, analyzed, and accessed. Much of this information is multimedia in nature, including digital images, video, audio, graphics, and text data. In order to make use of this vast amount of data, efficient and effective techniques to retrieve multimedia information based on its content need to be developed. Among the various media types, images are of prime importance. Not only are they the most widely used media type besides text, but they are also one of the most widely used bases for representing and retrieving videos and other multimedia information. This paper deals with the retrieval of images based on their contents, even though the approach is readily generalizable to other media types.

Keyword annotation is the traditional image retrieval paradigm. In this approach, the images are first annotated manually by keywords. They can then be retrieved by their corresponding annotations. However, there are three main difficulties with this approach: the large amount of manual effort required in developing the annotations, the differences in interpretation of image contents, and the inconsistency of the keyword assignments among different indexers. As the size of image repositories increases, the keyword annotation approach becomes infeasible.

To overcome the difficulties of the annotation based approach, an alternative mechanism, Content-Based Image Retrieval (CBIR), was proposed in the early 1990s. Besides using human-assigned keywords, CBIR systems use the visual content of the images, such as color, texture, and shape features, as the image index. This greatly alleviates the difficulties of the pure annotation based approach, since the feature extraction process can be made automatic and the image's own content is always consistent. Since its advent, CBIR has attracted great research attention, from government and industry to universities. Even ISO/IEC has launched a new work item, MPEG-7, to define a standard Multimedia Content Description Interface. Many special issues of leading journals have been dedicated to CBIR, and many CBIR systems, both commercial and academic, have been developed recently.

Despite the extensive research effort, the retrieval techniques used in CBIR systems lag behind the corresponding techniques in today's best text search engines, such as Inquery, Alta Vista, Lycos, etc. At the early stage of CBIR, research primarily focused on exploring various feature representations, hoping to find a best representation for each feature. For example, for the texture feature alone, almost a dozen representations have been proposed, including Tamura, MSAR, Wold decomposition, Fractal, Gabor Filter, and Wavelets. The corresponding system design strategy for early CBIR systems is to first find the best representations for the visual features. Then:

- During the retrieval process, the user selects the visual features that he or she is interested in. In the case of multiple features, the user also needs to specify the weights for each of the features.
- Based on the selected features and specified weights, the retrieval system tries to find images similar to the user's query.

We refer to such systems as computer centric systems. While this approach establishes the basis of CBIR, the performance is not satisfactory for the following two reasons:

- The gap between high level concepts and low level features. The computer centric approach assumes that mapping high level concepts to low level features is easy for the user to do. In some cases the assumption is true, e.g., mapping a high level concept (fresh apple) to low level features (color and shape). In other cases it may not be true. One example is mapping an ancient vase with a sophisticated design to an equivalent representation using low level features. A gap exists between the two levels.
- The subjectivity of human perception. Different persons, or the same person under different circumstances, may perceive the same visual content differently. This is called human perception subjectivity. The subjectivity exists at various levels. For example, one person may be more interested in an image's color feature while another may be more interested in the texture feature. Even if both people are interested in texture, the way they perceive the similarity of texture may be quite different. This is illustrated by a figure showing three texture images, (a), (b), and (c). Some may say that (a) and (b) are more similar if they do not care about the intensity contrast, while others may say that (a) and (c) are more similar if they ignore the local property on the seeds. No single texture representation can capture everything. Different representations capture the visual feature from different perspectives.

In the computer centric approach, the best features and representations and their corresponding weights are fixed, which cannot effectively model high level concepts and the user's perception subjectivity. Furthermore, specification of weights imposes a big burden on the user, as it requires the user to have comprehensive knowledge of the low level feature representations used in the retrieval system, which is normally not the case.

Motivated by the limitations of the computer centric approach, recent research focus in CBIR has moved to an interactive mechanism that involves a human as part of the retrieval process. Examples include interactive region segmentation, interactive im[...]

[...] such that the adjusted query is a better approximation to the user's information need. In the relevance feedback based approach, the retrieval process is interactive between the computer and the human. Under the assumption that high-level concepts can be captured by low-level features, the relevance feedback technique tries to establish the link between high-level concepts and low-level features from the user's feedback. Furthermore, the burden of specifying the weights is removed from the user. The user only needs to mark which images he or she thinks are relevant to the query. The weights embedded in the query object are dynamically updated to model the high level concepts and perception subjectivity.

The rest of the paper is organized as follows. Section II introduces a multimedia object model that supports multiple features, multiple representations, and their corresponding weights. The weights are essential in modeling high level concepts and perception subjectivity. The next section discusses how the weights are dynamically updated based on the relevance feedback to track the user's information need. Two further sections discuss the normalization procedure and the dynamic weight updating process, the two bases of the retrieval algorithm. Extensive experimental results on a large image collection, testing both the efficiency and the effectiveness of the retrieval algorithm, are then given, followed by concluding remarks.

II. The Multimedia Object Model

Before we describe how the relevance feedback technique can be used for CBIR, we first need to formalize how an image object is modeled. An image object $O$ is represented as

$O = O(D, F, R)$

- $D$ is the raw image data, e.g., a JPEG image.
- $F = \{f_i\}$ is a set of low-level visual features associated with the image object, such as color, texture, and shape.
- $R = \{r_{ij}\}$ is a set of representations for a given feature $f_i$, e.g., both color histogram and color moments are representations for the color feature. Note that each representation $r_{ij}$ itself may be a vector consisting of multiple components, i.e.,

$r_{ij} = [r_{ij1}, \ldots, r_{ijk}, \ldots, r_{ijK}]$

where $K$ is the length of the vector.
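The object model $O = O(D, F, R)$ can be sketched as a small data structure. The following Python sketch is illustrative only: the class names, the `weight` fields, and the `weighted_distance` function are hypothetical, and combining per-representation Euclidean distances as a weighted sum is an assumption made here for illustration, not the similarity measure defined by the paper. It follows the text in one respect: the weights are read from the query object.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Representation:
    """One representation r_ij of a feature, e.g. a color histogram."""
    name: str
    vector: List[float]   # r_ij = [r_ij1, ..., r_ijK]
    weight: float = 1.0   # assumed per-representation weight


@dataclass
class Feature:
    """One low-level feature f_i, e.g. color, texture, or shape."""
    name: str
    representations: Dict[str, Representation]
    weight: float = 1.0   # assumed per-feature weight


@dataclass
class ImageObject:
    """Image object O = O(D, F, R)."""
    data: bytes                   # D: raw image data
    features: Dict[str, Feature]  # F, each feature holding its part of R


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length K")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_distance(query: ImageObject, image: ImageObject) -> float:
    """Assumed measure: weighted sum of per-representation distances.

    The weights come from the query object, so changing them changes
    the ranking without touching the stored images.
    """
    total = 0.0
    for f_name, q_feature in query.features.items():
        i_feature = image.features[f_name]
        feature_dist = 0.0
        for r_name, q_rep in q_feature.representations.items():
            i_rep = i_feature.representations[r_name]
            feature_dist += q_rep.weight * euclidean(q_rep.vector, i_rep.vector)
        total += q_feature.weight * feature_dist
    return total
```

Under these assumptions, a relevance feedback loop would adjust only `query.features[...].weight` and the representation weights between rounds, then re-rank the collection with `weighted_distance`.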
