Histogram of Directions by the Structure Tensor
Josef Bigun
Halmstad University
IDE SE-30118
Halmstad, Sweden
[email protected]

Stefan M. Karlsson
Halmstad University
IDE SE-30118
Halmstad, Sweden
[email protected]

ABSTRACT
Many low-level features, as well as varying methods of extraction and interpretation, rely on directionality analysis (for example the Hough transform, Gabor filters, SIFT descriptors and the structure tensor). The theory of the gradient-based structure tensor (a.k.a. the second moment matrix) is a very well suited theoretical platform in which to analyze and explain the similarities and connections (indeed often equivalence) of supposedly different methods and features that deal with image directionality. Of special interest to this study are the SIFT descriptors (histograms of oriented gradients, HOGs). Our analysis of interrelationships of prominent directionality analysis tools offers the possibility of computing HOGs without binning, in an algorithm of comparable time complexity.

Categories and Subject Descriptors
I.4.7 [Image Processing and Computer Vision]: Feature Measurement; I.5.0 [Pattern Recognition]: General

General Terms
Algorithms, Performance

Keywords
Histogram of Oriented Gradients, Structure Tensor, Complex Weighting

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISABEL '11, October 26-29, Barcelona, Spain
Copyright 2011 ACM ISBN 978-1-4503-0913-4/11/10 ...$10.00.

1. INTRODUCTION
Directionality can be defined in several ways. With this paper, we make a case for using the structure tensor as the natural analytic tool to bind several definitions and derived algorithms together into one fold. In some cases, different directionality measures are identical, but a tool such as the structure tensor is required to show this. The theoretical exercise of doing this is worthwhile because it reduces redundancy in research (it avoids duplicate terms for the same entity). Also, insights are to be gained by trying to reduce directionality measures to the structure tensor. This is especially true for the study of histogram of oriented gradients (HOG) features (the descriptor of the SIFT algorithm [12]). We will present both how these are very similar to the structure tensor and how they differ, and in the process present a different algorithm for computing them without binning. In this paper, we limit ourselves to the study of three definitions of directionality and their associated features: 1) the structure tensor, 2) HOGs, and 3) Gabor filters. The relation of the Gabor filters to the tensor has been studied earlier [3], [9], and so for brevity, more attention will be given to the HOGs.

There are many other features that can be the object of a comparative study using the structure tensor, but the Gabor filters and HOGs are the most widely used. The generalized Hough transform [1] has been discussed before in this context ([2] Ch. 10.16, 11.6). Likewise, the Harris corner/singularity detector [8] can be fully explained in terms of the structure tensor ([2] Ch. 10.9), rather than as a presence-of-corner measure specified by a corner model.

It is straightforward to apply the principles outlined here to several works, e.g. [10], and find that they are identical to the structure tensor and vary in output through parameter selection. This does not lessen the value of these works, quite the contrary. They are unique in how they have been derived, from valuable viewpoints. However, they all raise the question of how large their common denominator is. Can these descriptors be used to complement each other? If so, then how, and what is gained?

1.1 Structure Tensor
The structure tensor (second moment matrix) can be seen as three coarse descriptors of the distribution of the gradient:

$$G = E\left[\nabla I \, \nabla^T I\right] = \begin{pmatrix} E[I_x I_x] & E[I_x I_y] \\ E[I_x I_y] & E[I_y I_y] \end{pmatrix}$$

where $I_x$ and $I_y$ are the partial derivatives of the image and $E[\cdot]$ indicates expected value [3]. The most significant eigenvector will yield the orientation of the directionality of the image. The amount of directionality can be defined in terms of the eigenvalues as

$$\frac{\lambda_{max} - \lambda_{min}}{\lambda_{max} + \lambda_{min}} = |\hat{\rho}_2(2)|$$

(this notation will be useful in later sections). Because the tensor is positive semidefinite we have that $|\hat{\rho}_2(2)| \in [0, 1]$, where the maximum occurs for the maximally directional image (such as the left illustration of Fig. 1), and zero for an image of no directionality. $|\hat{\rho}_2(2)| = 0$ occurs when there is no single preferred direction where energy is concentrated in the power spectrum (for example, when all directions have equal energy), or equivalently, when the gradient distribution has principal directions of equal variance (for example, no preferred direction in terms of variance).

Figure 1: Left, example of a highly directional image, with the gradients superimposed as arrows. Right, the HOG of the image on the left, with tiny amounts of added white noise.

In the following we will denote the image power spectrum as $p(\omega)$, where $\omega$ are the 2D frequency coordinates. Note that this is with respect to the original image $I(x)$ before differentiation into $\nabla I(x)$. We will denote with $f(\omega)$ the bivariate probability density function of $\nabla I$, where it is understood that $\omega$ in this context is a 2D vector whose magnitude and direction angle $(r, \theta)$ represent the magnitude and direction of the gradient. These two functions will be analyzed in very similar ways in terms of their moments, yet we want to be clear that the two entities are very different.

In one interpretation of the structure tensor, the image is analyzed on a topological torus and $G$ is analyzed as the second moment description of $p(\omega)$ [3]. In a nearly identical interpretation, $G$ contains the second moment description of $f(\omega)$. This implies that two quite different 2D entities will have equal second order moments. This is in fact not true in general! One can, however, easily show that this is always true if a toroidal topology is assumed. More specifically, we have:

Lemma 1. A sufficient condition that the pdf of $\nabla I$ ($f(\omega)$) share second order moments with the power spectrum ($p(\omega)$) of $I(x)$ is that $E[\nabla I] = 0$.

Proof. Consider the covariance matrix (the dispersion matrix) for $f(\omega)$:

$$E\left[\nabla I \, \nabla^T I\right] - E[\nabla I] \, E[\nabla^T I] = G - E[\nabla I] \, E[\nabla^T I]$$

This is clearly equal to the structure tensor if and only if $E[\nabla I] = 0$. The central moment matrix of the power spectrum, on the other hand, will always equal the structure tensor, because all odd ordered moments vanish due to Hermitian symmetry [3].

We will henceforth assume $E[\nabla I] = 0$ when discussing Gabor filters or the power spectrum interpretation of the struc-

1.2 Gabor Filter Magnitudes
Widely used for directional analysis are the 2D Gabor filters [7], [11], [5]. Gabor filters are Gaussians tessellating the Fourier domain according to some uniform partitioning scheme. We will assume the Gabor filter partitioning scheme originally suggested in [6], which is similar to that of [11] and is a perfectly direction-isotropic scheme, as illustrated in Fig. 2. Convolution of the image with the filter can be thought of as a sequence of scalar products. At a specific position in the image, such a scalar product constitutes a coarse estimation of the local Fourier domain. The full set of filter scalar products at a specific position estimates the local spectrum (coarsely) and is sometimes called the Gabor jet. The squared magnitude is the coarse estimation of the local power spectrum.

Figure 2: Example of rosette-like partitioning of the Log-Gabor filters. They give a sparse estimation of the local spectrum.

We will use the notation $\bar{p}(r, \frac{n\pi}{N})$ to denote the Gabor squared magnitude at radial frequency (tune-on) position $r$ and orientation $n$ (assuming a total number of $N$ orientations in the filter bank of $R$ radial frequency bands). In other words, $r$ and $n$ encode the coordinates of tune-on frequencies in the Fourier domain. We will collapse the Gabor responses to a function of only $n$ by a weighted averaging in the radial direction, and we will write simply

$$\bar{p}\left(\frac{n\pi}{N}\right) = \sum_r r^2 \, \bar{p}\left(r, \frac{n\pi}{N}\right)$$

where we make explicit the "abusive" notation of $\bar{p}(\cdot)$ with one argument being the average of $\bar{p}(\cdot, \cdot)$ with two arguments. In the same spirit, this will be an estimation of

$$p(\theta) = \int r^3 \, p(r, \theta) \, dr.$$

Note that because of Hermitian symmetry, $p(\theta)$ is periodic with period $\pi$.

1.3 Histogram of Oriented Gradients
Measuring directionality can also be done by building a histogram of oriented gradients (HOG). For each gradient in an image a "bin" is increased in value. The angle of the gradient determines which bin, and the magnitude determines how much is added to it (Fig. 1 illustrates this). A histogram is a crude form of non-parametric density estimation. A generalization is the Parzen window method [15], where many positions (bins) in an angular vicinity are updated (this is often called kernel-based estimation of the histogram). Assuming that such estimation is performed, there is no technical complication from grouping involved, i.e.