Local Feature Extraction in Log-Polar Images

Manuela Chessa and Fabio Solari

Department of Informatics, Bioengineering, Robotics and System Engineering - DIBRIS, University of Genoa, Via All’Opera Pia 13, 16145 Genova, Italy

Abstract. We propose two different strategies to compute edges in the log-polar (cortical) domain. The space-variant processing is obtained either by applying local operators (e.g. local derivative filters) directly to the log-polar images, or by embedding the same operators into the log-polar mapping, thus obtaining a cortical representation of the Cartesian features. The two approaches have been tested with three standard algorithms for edge detection (Canny, Marr-Hildreth and Harris), applied to the BSDS500 dataset. Qualitative and quantitative comparisons provide a first indication of the validity of the proposed approaches.

Keywords: Space-variant processing · Foveated images · Edge detection · Corner detection · Cortical representation · Bio-inspired visual processing

1 Introduction

The computation of local image features, such as edges and corners, underlies many approaches to matching and recognition, which are important in image processing, computer vision and robotics applications. The literature describes several edge or contour detection methods based on local operators applied in the Cartesian domain. Some algorithms detect edges by convolving a grayscale image with local derivative filters (e.g. the Roberts, Sobel and Prewitt operators); the Marr and Hildreth method uses the zero crossings of the Laplacian of Gaussian operator; and the Canny operator defines edge detection and localization criteria based on the first derivatives of a Gaussian [5]. A combined edge and corner detector, based on local derivatives, has been proposed by Harris [8].
More recent local approaches take into account color and texture information and make use of learning techniques for cue combination [11]. A contour detector that combines multiple local cues into a globalization framework, based on spectral clustering, is described in [1], and a multi-scale Harris corner detector is proposed in [7]. Although great effort has gone into improving edge detectors in the Cartesian domain, few works address the same problem in the log-polar (cortical) domain [14].

© Springer International Publishing Switzerland 2015. V. Murino and E. Puppo (Eds.): ICIAP 2015, Part I, LNCS 9279, pp. 410–420, 2015. DOI: 10.1007/978-3-319-23231-7_37

Space-variant images are promising for many image processing and robotics applications, since they provide a high spatial resolution in the region of interest, i.e. the fovea, and reduce the amount of data to be processed, similarly to what happens in the mammalian visual system. Indeed, the distribution of the photoreceptors in the primate retina is space-variant (i.e. denser in the center, the fovea, and sparser in the periphery), and the projection of such photoreceptors onto the primary visual cortex can be described by a log-polar mapping [14]. However, processing log-polar images is a challenging task, because the image distortions generated by the retino-cortical transform often require a specific adaptation of the algorithms for them to work properly. Primal feature extraction [10] in log-polar images has been addressed by some authors in the literature, who propose ad-hoc solutions designed to work in the cortical domain.
In [12], the authors present a mechanism for computing operators such as edge detection and the Hough transform directly in foveated images; in [6], the authors present an approach to extract edges, bars, blobs and ends from log-polar images, based on neural networks that learn the feature’s class; and a comparison of several strategies for gradient detection in log-polar images is presented in [18].

In this paper, we propose two approaches that allow standard algorithms to work in the log-polar domain without specific adaptation. Thus, we do not propose a new method for feature detection; rather, we analyze how well-known state-of-the-art methods for edge and corner extraction work in the log-polar domain. In particular, the aim of the paper is to show the performance, including the accuracy of the feature detection, of the two considered approaches: (i) the local operator for edge detection (i.e. derivative of Gaussian, and Laplacian of Gaussian) is applied to the cortical image, i.e. to the image that has been transformed into the log-polar image through low-pass Gaussian filters; (ii) the local operator is embedded into the log-polar transform, thus producing a cortical representation of the Cartesian derivatives of the image, on which edges are computed. Moreover, we assess the two proposed approaches by using the metrics and the BSDS500 dataset presented in [1].

2 Log-Polar Mapping

The log-polar mapping is a nonlinear transformation that maps each point of the Cartesian domain (x, y) into a cortical domain described by the coordinates (ξ,η). In the literature, several log-polar mapping models are described [2,4,9]. We consider the central blind-spot model, since it is characterized by scale and rotation invariance [17].
The log-polar transformation is described by the following equations:

\[ \xi = \log_a\!\left(\frac{\rho}{\rho_0}\right), \qquad \eta = q\,\theta, \tag{1} \]

where a parameterizes the nonlinearity of the mapping, q is related to the angular resolution, ρ0 is the radius of the central blind spot, and $(\rho, \theta) = (\sqrt{x^2 + y^2},\ \arctan(y/x))$ are the polar coordinates derived from the Cartesian ones. All points with ρ < ρ0 are ignored, thus ρ0 has to be small with respect to the size of the image. In order to deal with digital images, given a Cartesian image of M × N pixels, and defining ρmax = 0.5 min(M, N), we obtain an R × S (rings × sectors) discrete cortical image of coordinates (u, v) by taking:

\[ u = \lfloor \xi \rfloor, \qquad v = \lfloor \eta \rfloor, \tag{2} \]

where ⌊·⌋ denotes the integer part, q = S/(2π), and a = exp(ln(ρmax/ρ0)/R). Figure 1 shows the transformations through the different domains.

The retinal area (i.e. the log-polar pixel) that refers to a given cortical pixel defines its receptive field (RF). By inverting Eq. 1, the centers of the RFs can be computed; these points are non-uniformly distributed over the retinal plane, as in Figure 2a (green crosses). The optimal relationship between R and S is the one that brings the log-polar pixel aspect ratio γ as close as possible to 1. It can be shown that, for a given R, the optimal rule is S = 2π/(a − 1) [15,17].

Fig. 1 (panels: Cartesian domain, cortical domain, retinal domain). Left: the cyan circle and the green sector in the Cartesian domain (x, y) map to vertical and horizontal stripes, respectively, in the cortical domain (ξ,η). The red area represents an RF that is mapped to the corresponding cortical pixel. Right: an example of image transformation from the Cartesian to the cortical domain, and backward to the retinal domain. The RFs (yellow circles) are overlaid on the Cartesian image. The specific choice of the mapping parameters is: R = 80, S = 131, ρ0 = 3, and ρmax = 256. The cortical image is scaled to improve the visualization.
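As a rough illustration of Eqs. 1 and 2, the following Python sketch derives the mapping constants and converts between Cartesian and cortical coordinates. It is not the authors' implementation. The function names are hypothetical, the coordinates are assumed to be relative to the image center, the angle is computed with `atan2` wrapped to [0, 2π) instead of arctan(y/x) so that all four quadrants are covered, and rounding S to the nearest integer is likewise an assumption.

```python
import math


def logpolar_params(M, N, R, rho0):
    """Derive rho_max, a, S and q from the image size, ring count and blind-spot radius."""
    rho_max = 0.5 * min(M, N)
    a = math.exp(math.log(rho_max / rho0) / R)
    # Aspect-ratio rule S = 2*pi / (a - 1); rounding to an integer is an assumption.
    S = round(2 * math.pi / (a - 1))
    q = S / (2 * math.pi)
    return rho_max, a, S, q


def cartesian_to_cortical(x, y, a, q, rho0):
    """Map a centre-relative point (x, y) to discrete cortical (u, v), Eqs. 1-2.

    Returns None for points inside the central blind spot (rho < rho0).
    """
    rho = math.hypot(x, y)
    if rho < rho0:
        return None
    # Assumption: atan2 wrapped to [0, 2*pi) replaces arctan(y/x).
    theta = math.atan2(y, x) % (2 * math.pi)
    xi = math.log(rho / rho0) / math.log(a)
    eta = q * theta
    return int(xi), int(eta)


def cortical_to_cartesian(xi, eta, a, q, rho0):
    """Invert Eq. 1: continuous cortical (xi, eta) back to centre-relative (x, y)."""
    rho = rho0 * a ** xi
    theta = eta / q
    return rho * math.cos(theta), rho * math.sin(theta)
```

For example, `logpolar_params(256, 256, 80, 3.0)` yields S = 131 under these assumptions. Evaluating `cortical_to_cartesian` at half-integer coordinates such as (u + 0.5, v + 0.5) gives one possible choice of RF center; the source does not state which offset is used.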
The shape of the RFs affects both the quality of the transformation and its computational burden. In [3] the authors analyze four techniques, each characterized by a different shape for the RFs: nearest pixel, bilinear interpolation, adjacent RFs, and overlapping circular RFs. The overlapping circular RFs [2,13] are the most biologically plausible technique and they better preserve the image information [3]; we therefore adopt this solution in the paper.

To implement the log-polar mapping, the Cartesian plane is divided into two regions: the fovea and the periphery. The periphery is defined as the part of the plane in which the distance between the centers of two consecutive RFs on the same radius is greater than 1 pixel (undersampling). To obtain the cortical image we use overlapping Gaussian RFs, as shown in Figure 2a. The fovea (in which we have oversampling, i.e. the distance between two consecutive RFs is less than 1 pixel) is handled by using fixed-size RFs, whereas in the periphery the size of the RFs grows. The standard deviation of the RF Gaussian profile is a third of the distance between the centers of two consecutive RFs, and the spatial support is six times the standard deviation. As a consequence of this choice, adjacent RFs overlap. A cortical pixel $C_i$ is computed as a Gaussian-weighted sum of the Cartesian pixels $P_j$ in the i-th RF: $C_i = \sum_j w_{ij} P_j$, where the weights $w_{ij}$ are the values of a normalized Gaussian centered on the i-th RF. A similar approach is used to compute the inverse log-polar mapping that produces the retinal image, where the space-variant effect of the log-polar mapping is observable.

Fig. 2 (panels: (a) Gaussian RFs, (b) Laplacian of Gaussian RFs, (c) Derivatives of Gaussian RFs). The RFs considered to obtain the cortical representation of the image. (a) Gaussian RFs, used to obtain the cortical image (log-polar transform). (b) Laplacian of Gaussian RFs and (c) Derivative of Gaussian RFs (along horizontal and vertical axes, respectively), used to obtain the cortical representation of the derivatives of the image.

3 Feature Detection in the Log-Polar Domain

The cortical representation R(ξ,η) of a space-variant processed Cartesian image I(x, y) is described as follows:

\[ R(\xi,\eta) = \big\langle\, g\big(x - x_0(\xi,\eta),\ y - y_0(\xi,\eta)\big),\ I(x, y) \,\big\rangle, \tag{3} \]

where ⟨·,·⟩ denotes the inner product, g(x, y) is the local operator that defines the weights of the log-polar mapping, and (x0(ξ,η), y0(ξ,η)) is the center of each RF.
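To make Eq. 3 concrete for the plain Gaussian case, the sketch below evaluates one cortical pixel as the inner product between a Gaussian RF and the image, using the stated choices of a standard deviation equal to a third of the RF spacing and a support of six standard deviations. It is an illustration under assumptions, not the authors' code: the function name is hypothetical, the image is a list of rows indexed as `image[y][x]` in pixel coordinates, and "normalized" is taken to mean that the sampled weights are divided by their sum.

```python
import math


def gaussian_rf_response(image, x0, y0, spacing):
    """Gaussian-weighted sum of the pixels under one RF centred on (x0, y0).

    image   : list of rows of grey levels, indexed as image[y][x]
    spacing : distance between the centres of two consecutive RFs
    """
    sigma = spacing / 3.0   # a third of the centre spacing, as stated in the text
    half = 3.0 * sigma      # total support of six standard deviations
    height, width = len(image), len(image[0])
    x_lo = max(0, math.ceil(x0 - half))
    x_hi = min(width - 1, math.floor(x0 + half))
    y_lo = max(0, math.ceil(y0 - half))
    y_hi = min(height - 1, math.floor(y0 + half))
    num = 0.0
    den = 0.0
    for y in range(y_lo, y_hi + 1):
        for x in range(x_lo, x_hi + 1):
            w = math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))
            num += w * image[y][x]
            den += w
    # Assumption: weights are normalized by their sum; an RF that covers
    # no pixel at all returns 0.0.
    return num / den if den > 0.0 else 0.0
```

With this normalization a constant image maps to the same constant value. Replacing the Gaussian weight with a derivative of Gaussian or a Laplacian of Gaussian would correspond to embedding the operator in the mapping, although such kernels would need a different normalization, since their weights do not sum to a positive constant.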
