An Effective Approach for Querying Web Images using Text and Content Data

Thesis Proposal By XXXXXXXX Advisor : Prof. C.C Lee Mathematics and Comp Sci Dept, California State University, Hayward, 25800 Carlos Bee Boulevard, Hayward, CA 94542

Abstract Ever since the advent of Internet, there has been an immense growth in the amount of image data that is available on the World Wide Web. With such a magnitude of image availability, an efficient and effective image retrieval system is required to make use of this information. Several new techniques have been proposed in the recent years for retrieving images and among them, a key technique is the content-based approach that retrieves images based on various image features. However many of the existing content- based approaches are computationally intensive and not suited for the web image search.

This thesis presents an effective image matching and indexing technique that integrates both topic-based and content-based approach. Topic-based image retrieval is based on an improved text information retrieval (IR) technique that makes use of the structured format of HTML documents. A computationally less intensive technique based on color feature is used to perform content-based matching of images. The main goal is to develop a functional image search and indexing system and to demonstrate that better retrieval results can be achieved with this proposed hybrid search technique.

Background An exponential increase in the amount of image data on the Internet has brought the need for efficient and effective image search systems. This demand has made image retrieval a very active area of research in recent years [1][4][5][6][12]. Search technology may be the foundation of the Internet, but if one is looking for rich media content, today's text or keyword-based searches are inadequate. General-purpose search engines such as Google and AltaVista that are used today to search for images on the Internet mainly use the keyword or the text based approach.

This approach is not satisfactory, because the text-based description tends to be incomplete, imprecise, and inconsistent in specifying visual information [8][11][13][14] [15]. For photographs, graphics, logos, audio or video, text descriptions rarely convey any valuable information. For example, a search for a certain shade of blue sky is very difficult with textual descriptions. Using a text-only search engine, a query to look for photographs of tigers on the Web, would result in thousands of search results. And there is no easy way to ensure that the search results are relevant and worth looking at.

As a result, content-based image retrieval (CBIR) has received considerable research and commercial interest in the recent years. Most current content-based image retrieval techniques make use of features such as color, shape and texture of the image [1][3][6][7] [8][9][10]. But, this approach can have its limitations too. For example, a query that focuses only on texture would not be able to distinguish between a tiger and a zebra. Using image texture feature alone does not provide satisfactory results when used as the basis for querying images [3][6][15]. Classical shape recognition techniques tend to require that the object be clearly segmented from the rest of the image, a process that would require a lot of computations [2][8][14]. The main thrust of research in shape recognition has been for fixed, geometric objects in controlled images such as machine parts on a white background. Classical techniques can be easily applied in such cases, but they do not prove very useful in more general settings. Segmentation is imperfect in general cases because the shape, size, and color of objects contained in such photos can vary greatly [3].

Existing techniques have focused on image searches within client-server architectures. Specifically, images are retrieved from conventional databases, where network and access latencies are not issues. However, the Internet has grown to become a vast repository of multimedia information, all of which cannot be possibly stored within a single database. Also querying for an exact image is not so easy because of the sheer volume of image data to be processed to find a match. Hence, there is a need for techniques that would be intelligent enough to derive search results taking into consideration both image and textual information available on the Internet.

Research Approach This thesis presents an effective image matching and indexing technique that integrates query by topic and query by example specification methods. The proposed technique follows a two-phase approach as shown in Figure 1. The first phase uses the query by topic specification to perform a fast and high-level filtration according to the conceptual description associated with each image. The second phase uses the query by example specification to perform a low-level content-based image match for the retrieval of smaller and relatively closer results of the example image.

The main components and the technology used in the proposed system are as follows:

 Web Based UI to accept queries from the User  Focused Crawler for the text based filtration of web pages  Image Processor for Feature Computation  Indexer and retrieval of relevant images  Web Based display of image query results. All the above modules except the Focused Crawler will be designed and developed by us using Java, Java Advanced Imaging, JSP, Apache Tomcat and other open source technologies. The Focused Crawler used in the proposed system is a revised version of an existing one.

PHASE I: TEXT BASED APPROACH

Topic /Keyword Focused Link containing images related Description Crawler to the topic specified

Query Image Formation

Feature Computation

User Similarity Image Processor Comparator

Output

Display Indexing and Feature Retrieved result Retrieval Computation

PHASE II: CONTENT BASED APPROACH

Figure 1 – Proposed Text-Based And Content-Based Image Retrieval System

For the initial phase of text-based approach, we propose to accept keyword(s) and the query image from the user that will be provided to a focused crawler. The proposed focused crawler will search web pages for the specified keywords laying emphasis on their location within the HTML document, unlike the existing one. A suitable weighting scheme will enable the crawler to perform an efficient initial filtration.

Once the crawler comes up with a potential link to an image, the second phase of content- based image matching is initiated. The system converts the example image into an internal representation of features. We propose to use histogram-based technique in extracting and comparing color based feature information. Texture and shape-based matching algorithms involve computationally intensive operations like segmentation and transformation, which may not be suitable for a web crawler that is involved in searching the Internet. Hence, we will build upon the color based methods to provide a fast and efficient method to match images. After the second phase of content searching, we will categorize and present the search results in a user-friendly way. The proposed technique is not only computationally less intensive, which makes it ideal for the Internet but also results in fewer but accurate image query results.

Deliverables  User Interface for submitting the keyword and image to be queried  An implementation of the crawler and an implementation of the image indexer.  Analysis and comparison of the proposed approach versus existing techniques.  Thesis Presentation

References [1] H.D. Cheng and Y. Sun, “A Hierarchical Approach to Color Image Segmentation Using Homogeneity,” IEEE Transactions on Image Processing, Dec. 2001.

[2] A. Sajjanhar and G. Lu, "A grid based shape indexing and retrieval method", Special Issue of Australian Computer Journal on Multimedia Storage and Archiving Systems, November 1997, Vol.29, No.4, pp.131-140.

[3] S. Belongie et al. “Color- and Texture-Based Image Segmentation Using EM and its Application to Content-Based Image Retrieval”. In Proc. of Int. Conf. Comp. Vis. 1998.

[4] Markus Koskela, Jorma Laaksonen, and Erkki Oja (2001). “Comparison of Techniques for Content-Based Image Retrieva”, Proceedings of the 12th Scandinavian Conference on Image Analysis (SCIA 2001), Bergen, Norway, pp. 579–586.

[5] IEEE MultiMedia, “The Holy Grail of Content-Based Media Analysis”, IEEE MultiMedia, v.9 n.2, p.6-10, April 2002

[6] Andrea Casanova, Matteo Fraschini, Sergio Vitulano , “Context: A Technique for Image Retrieval Integrating CONtour and TEXTure Information(2002)” Proceedings of the XV Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’02)

[7] Wichian Premchaiswadi , Nucharee Premchaiswadi, Theianchai Patnasirivakin, Sutasinee Chimlek, and Seinosuke Narita Proc. “Image Indexing Technique and Its Parallel Retrieval on PVM VIIth Digital Image Computing: Techniques and Applications” 2003

[8] B.G.Prasad, S.K.Gupta and K.K.Biswas “Color and Shape Index for Region-Based Image Retrieval” 4th International Workshop on Visual Form IWVF4, Capri, Italy, May 2001. Proceedings published in Lecture Notes in Computer Science, LNCS 2059, Springer Verlag, pp. 716 -- 725. [9] Xiuqi Li, Shu-Ching Chen, Mei-Ling Shyu, and Borko Furht, "An Effective Content- Based Visual Image Retrieval System," Proceedings of the 26th IEEE Computer Society International Computer Software and Applications Conference (COMPSAC), pp. 914- 919, August 26-29, 2002, Oxford, England.

[10] Jianfeng Ren, Yuli Shen, Lei Guo “A Novel Image Retrieval Based on Representative Colors” Proceedings of IVCNZ 2003

[11] G. Lu and S. Teng, "A novel image retrieval technique based on vector quantization", Proceedings of International Conference on Computational Intelligence for Modelling, Control and Automation, 17-19 Feb. 1999, Viana, Austria, pp. 36-41.

[12] G. Lu and B. Williams, "An integrated WWW image retrieval system", Australian WWW Conference, 17-20 April, 1999.

[13] B. Furht and P. Saksobhavivat, "A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data," Proc. of SPIE Symposium on Multimedia Storage and Archiving Systems, Boston, MA, November 1998.

[14] Ardizzone, E., Chella, A., Pirrone, R., “Shape Description for Content-based Image Retrieva”l, Proc. of Fourth International Conference on Visual Information Systems VISUAL 2000, November 2-4 2000, Lyon, France, Springer-Verlag, 212-222.

[15] Pirrone, R., La cascia, M., “Texture Classification for Content-based Image Retrieval”, Edoardo Ardizzone, Vito di Gesù (eds.) ICIAP 2001 11th Internazional Conference on Image Analysis and Processing, September 22-28 2001, Palermo, Italy, 398-403. I understand that the idea of integrating text and context-based approach is something that has already been done before. In my proposal, I have included a reference to a paper [12] describing one such technique.

However, there are many ways in which this integration can be achieved. There are different ways to perform the text based query and similarly a number of ways to analyze and compare the image data using content-based approach. In the proposed system, the web crawler used to perform the text-based search is a Focused web crawler. The crawler not only provides for the user to enter the keyword and its scope but also internally uses the weight scheme to get much closer results. None of the existing image retrieval systems for the Internet that uses both text and content-based approach involves this kind of a crawler for text search.

Also, most of the recent work is geared towards image searches within client-server architectures [16]. Specifically, images are retrieved from conventional databases, where network and access latencies are not issues. Such systems use very complicated algorithms that are computationally expensive. So in that regard I propose to use a technique based on DCT coefficients that is not only effective but also efficient in retrieving images from the web. Again, this technique has nor been used by any image retrieval system for the World Wide Web.

1) Existing search technologies including Webseer and Google do not have capabilities for QBE - Query By Example . Webseer apart from the accepting the keyword from the user also gathers information related to the image like size, dimension, color etc manually. In the proposed system the information is extracted from the query image file. This kind of an approach definitely will result in better results

2)The web crawler used to perform the text based search in the proposed system is Focused web crawler. For example : if the user is searching for picture of birds in forest and not in books he can specify the keyword as follows - TREES AND (FOREST OR GARDEN) AND NOT BOOKS

This kind of a technique clubbed with weight allocation based on the location of the keyword in the HTML document results in much more accurate or relevant pages.

3)If there is an image A. of size 50 X 50 and if there is another link to image B with a different size, the proposed system basically scales the images to a fixed size before comparing getting more matches.

4) Existing search technologies rely on storing all the pages within a database - this can easily run into terabytes. I propose to minimize this using intelligent technique like DCT coefficients.