A Review of Color Image Segmentation Based on Visual Attention

2017 2nd International Conference on Computer Science and Technology (CST 2017), ISBN: 978-1-60595-461-5

Ning-yu ZHANG, Qing LIU, Wen-zhu YANG*, Si-le WANG, Zhen-chao CUI, Li-pin CHEN and Xiang-yang CHEN
College of Computer Science and Technology, Hebei University, Baoding 071002, China
[email protected]
*Corresponding author

Keywords: Visual Attention, Color Image Segmentation, Saliency

Abstract. Color image segmentation is a key technology in the field of computer vision, and the quality of the segmentation results has a decisive influence on all subsequent image analysis. In essence, computer vision uses a computer in place of human vision to extract and identify targets. Most traditional methods focus on the intrinsic characteristics of the image and ignore the role the human observer plays in the analysis process. In recent years, visual saliency, which simulates how humans extract visual objects, has brought substantial improvements to image segmentation. This paper first introduces visual attention computational models, then summarizes their applications in color image segmentation, and finally analyzes the current challenges in visual attention modeling.

Introduction

Color image segmentation is an important issue in the field of computer vision. The segmentation results directly affect the quality of subsequent image identification, analysis, and understanding [1]. Color image segmentation divides an image into non-overlapping regions, each with its own meaning, that satisfy the consistency conditions of their particular areas [2]. Segmentation methods use basic features (such as color, texture, intensity, and orientation) to divide the image into different regions [3]. So far, no segmentation method is applicable to most situations; most methods target specific scenarios. Typical approaches, such as those based on clustering, thresholding, or support vector machines (SVM), are usually complex and slow.

Since Koch and Ullman proposed the concept of saliency in 1985, many researchers have studied visual saliency. As this research has developed, more researchers have turned to color image segmentation based on visual saliency. These methods not only bring the segmentation results closer to the characteristics of the Human Visual System (HVS) and to human expectations and requirements, but also improve segmentation speed [4]. Human vision naturally divides an observed image into object and background regions, so that most visual attention is devoted to the targets while background information is ignored [5]. Recently, visual saliency has attracted wide interest because it offers a fast solution to some otherwise complex processing [6].

Saliency originates from visual rarity, unpredictability, surprise, or uniqueness, and is often generated by variations in image attributes such as gradient, boundaries, color, and edges. Visual saliency detection locates and segments the object regions that are most prominent and significant in the whole scene. Extracted saliency maps are widely used in many computer vision applications, such as object recognition, image segmentation, image retrieval, and image editing [7].

Color image segmentation simulates the human visual perceptual system to separate the object-of-interest region. Segmentation based on visual attention first locates the object of interest with a visual attention method and then applies other methods for accurate segmentation, as sketched below. Using the visual attention mechanism to segment color images with complex backgrounds not only conforms to the human visual perceptual process but is also fast and accurate, which has made it a research hotspot in the field of machine learning and lays a good foundation for further learning tasks.
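As a concrete illustration of this locate-then-refine pipeline, the following is a minimal Python sketch, not drawn from any of the reviewed papers. It assumes opencv-contrib-python is installed (for the cv2.saliency module), uses OpenCV's spectral-residual detector purely as a stand-in attention model, and uses GrabCut as a stand-in refinement step; scene.jpg is a placeholder file name.

```python
import cv2
import numpy as np

# Step 1: locate the object of interest with a (stand-in) attention model.
img = cv2.imread("scene.jpg")
detector = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, sal = detector.computeSaliency(img)          # float map in [0, 1]
sal = (sal * 255).astype(np.uint8)

# Step 2: threshold the saliency map into a coarse object mask (Otsu).
_, coarse = cv2.threshold(sal, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Step 3: refine with GrabCut, seeded from the coarse saliency mask.
mask = np.where(coarse > 0, cv2.GC_PR_FGD, cv2.GC_PR_BGD).astype(np.uint8)
bgd = np.zeros((1, 65), np.float64)
fgd = np.zeros((1, 65), np.float64)
cv2.grabCut(img, mask, None, bgd, fgd, 5, cv2.GC_INIT_WITH_MASK)
fg = (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)
segmented = fg.astype(np.uint8) * 255            # final binary segmentation
cv2.imwrite("segmented.png", segmented)
```

Any attention model reviewed in this paper could replace the spectral-residual stage; only the coarse mask handed to the refinement step changes.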
Visual Attention Computational Model

The Driving Mode of Visual Attention

Attention is a concept from psychology and a part of the cognitive process [8]. It comprises visual attention and auditory attention [9]; the role of visual attention is to quickly find a target or area of interest within a large amount of visual information. Studies report that the HVS receives on the order of 10^8 to 10^9 bits of visual data per second [10,11]. According to how it forms, visual attention can be divided into two types: bottom-up attention, which is rapid, task-independent, and driven by low-level image features (for example, color, texture, intensity, and orientation); and top-down attention, which is slower, task-dependent, and volition-controlled [12]. The two factors that guide human attention are illustrated in Fig. 1.

Figure 1. Examples of the two modes of visual attention. (Items in the first row attract bottom-up attention: the vertical bar among horizontal bars and the red bar among gray bars; images from [13]. Pedestrians in the second row attract top-down attention; images from the "Walking1" and "Walking2" sequences of [14].)

Classification of Visual Attention Computational Models

The development of visual attention computational models originates from the "feature integration theory" proposed by Treisman and Gelade [15] in 1980. This theory identifies the visual features that are important for visual attention and describes how these features are integrated to allocate attention. Building on it, Koch and Ullman [16] proposed a framework for visual attention and put forward the concept of the "saliency map" for the first time; in 1998 this line of work produced the first relatively complete computational model of visual attention, the Itti model. Since then, visual attention research has entered a period of rapid development, producing many models, most of them inspired by the Itti model.

Cognitive Attention Models

Almost all attention models are directly or indirectly inspired by the feature integration theory of Treisman and Gelade [15], which identifies the important visual features and describes how they control human attention. Subsequently, Koch and Ullman proposed a feedforward model that combines these features and introduced the "saliency map", which represents the salient regions of the scene. They also introduced a "winner-take-all" neural network to select the most salient area, together with an "inhibition of return" suppression strategy that allows the focus of attention to shift to the next most salient area.
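Winner-take-all selection with inhibition of return can be summarized in a few lines. The sketch below is an illustrative simplification rather than Koch and Ullman's neural network: the winner is simply the argmax of the saliency map, and inhibition is a hard zeroing of a disk around each fixation, whose radius ior_radius is an assumed parameter.

```python
import numpy as np

def attend(saliency, n_fixations=3, ior_radius=30):
    """Select successive fixation points from a saliency map via
    winner-take-all, suppressing each winner's neighborhood
    (inhibition of return) so attention moves on."""
    s = saliency.astype(np.float32).copy()
    ys, xs = np.mgrid[0:s.shape[0], 0:s.shape[1]]
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(s), s.shape)  # winner-take-all
        fixations.append((int(y), int(x)))
        # Inhibition of return: zero a disk around the current winner.
        s[(ys - y) ** 2 + (xs - x) ** 2 <= ior_radius ** 2] = 0.0
    return fixations
```

Called on the saliency map S of Eq. (4) below, attend(S) returns the scanpath of attended locations in decreasing order of salience.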
The earliest bottom-up cognitive attention model was proposed by Itti et al. [17] in 1998; it combines knowledge of the human visual system with a neural network architecture, and the model was further refined in 2003 [18]. Figure 2 shows the architecture of this bottom-up visual attention computational model. The Itti model is the most representative of its kind and has become the standard for bottom-up visual attention models.

Figure 2. Architecture of the Itti saliency-based visual attention model: the input image is linearly filtered into color, intensity, and orientation channels; center-surround differences and normalization yield feature maps (12 color, 6 intensity, and 24 orientation maps); across-scale combination and normalization yield conspicuity maps, which are linearly combined into the saliency map; winner-take-all with inhibition of return then selects the attended location. (Adapted from [17].)

For an input image, this model extracts three primary visual features: color, intensity, and orientation. Center-surround operations at multiple scales generate feature maps that reflect how likely each location is to be salient, and the resulting maps are merged to obtain the final saliency map. A biologically inspired winner-take-all competition mechanism then finds the most salient spatial position, which guides the selection of the attended location, and the inhibition-of-return method completes the transfer of visual focus.

More precisely, the input image is subsampled into a Gaussian pyramid, and each pyramid level is decomposed into channels for Red (R), Green (G), Blue (B), Yellow (Y), intensity (I), and orientations (O). From these channels, center-surround "feature maps" $f_{l,c,s}$ are constructed and normalized for the different features $l$, where $c$ and $s$ index the center and surround pyramid scales. In each channel, maps are summed across scales ($\oplus$ denotes across-scale addition) and normalized again:

$$ f_l = \mathcal{N}\left( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} \mathcal{N}(f_{l,c,s}) \right), \qquad \forall\, l \in L_I \cup L_C \cup L_O, \tag{1} $$

$$ L_I = \{ I \}, \qquad L_C = \{ RG, BY \}, \qquad L_O = \{ 0^{\circ}, 45^{\circ}, 90^{\circ}, 135^{\circ} \}. \tag{2} $$

These maps are then linearly summed and normalized once more to produce the conspicuity maps:

$$ C_I = f_I, \qquad C_C = \mathcal{N}\Big( \sum_{l \in L_C} f_l \Big), \qquad C_O = \mathcal{N}\Big( \sum_{l \in L_O} f_l \Big). \tag{3} $$

Finally, the conspicuity maps are linearly combined to produce the saliency map:

$$ S = \frac{1}{3} \sum_{k \in \{I, C, O\}} C_k. \tag{4} $$

Four implementations of this model are available on the Internet: the Saliency Toolbox (STB) by Walther [19], the Matlab code by Harel [20], iNVT by Itti [17], and VOCUS by Frintrop [21]. Most existing cognitive models are based on the Itti model, since attention is in general unconsciously driven by low-level features such as contrast, color, and intensity. These methods generally follow three steps: first, feature extraction; second, saliency computation;
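To make Eqs. (1)–(3) concrete, here is a minimal Python/OpenCV sketch of the intensity channel alone (C_I); it is a simplification under stated assumptions, not the authors' code. The normalization operator N(·) is reduced to range scaling plus a crude peak-promotion term, the input file name input.jpg is a placeholder, and the image is assumed large enough (roughly 256 pixels or more per side) to build a 9-level pyramid.

```python
import cv2
import numpy as np

def normalize(m):
    """Simplified stand-in for Itti's N(.): rescale to [0, 1], then
    promote maps with one dominant peak over maps with many peaks."""
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)
    peaks = m[m > 0.1]
    mean_peak = peaks.mean() if peaks.size else 0.0
    return m * (1.0 - mean_peak) ** 2

def gaussian_pyramid(img, levels=9):
    """Levels 0..8 of a Gaussian pyramid, as used by Eq. (1)."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def center_surround(pyr, c, s):
    """|center - surround|: upsample scale s to scale c and subtract."""
    h, w = pyr[c].shape
    surround = cv2.resize(pyr[s], (w, h), interpolation=cv2.INTER_LINEAR)
    return np.abs(pyr[c] - surround)

def intensity_conspicuity(gray):
    """C_I of Eq. (3): across-scale sum of normalized feature maps,
    with c in {2,3,4} and s in {c+3, c+4} as in Eq. (1)."""
    pyr = gaussian_pyramid(gray)
    h, w = pyr[2].shape                      # accumulate at scale 2
    acc = np.zeros((h, w), np.float32)
    for c in (2, 3, 4):
        for s in (c + 3, c + 4):
            fm = normalize(center_surround(pyr, c, s))
            acc += cv2.resize(fm, (w, h))    # across-scale addition
    return normalize(acc)

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)
gray = img.astype(np.float32) / 255.0
C_I = intensity_conspicuity(gray)
cv2.imwrite("conspicuity_I.png", (C_I * 255).astype(np.uint8))
```

The color-opponency (RG, BY) and Gabor-orientation channels follow the same center-surround pattern, after which Eq. (4) averages the three conspicuity maps into the final saliency map S.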
