Local descriptors
SIFT (Scale Invariant Feature Transform) descriptor
• SIFT keypoints at loca on xy and scale σ have been obtained according to a procedure that guarantees illumina on and scale invance. By assigning a consistent orienta on, the SIFT keypoint descriptor can be also orienta on invariant.
• SIFT descriptor is therefore obtained from the following steps: – Determine loca on and scale by maximizing DoG in scale and in space (i.e. find the blurred image of closest scale) – Sample the points around the keypoint and use the keypoint local orienta on as the dominant gradient direc on. Use this scale and orienta on to make all further computa ons invariant to scale and rota on (i.e. rotate the gradients and coordinates by the dominant orienta on) – Separate the region around the keypoint into subregions and compute a 8-bin gradient orienta on histogram of each subregion weigthing samples with a σ = 1.5 Gaussian SIFT aggrega on window
• In order to derive a descriptor for the keypoint region we could sample intensi es around the keypoint, but they are sensi ve to ligh ng changes and to slight errors in x, y, θ. In order to make keypoint es mate more reliable, it is usually preferable to use a larger aggrega on window (i.e. the Gaussian kernel size) than the detec on window.
The SIFT descriptor is hence obtained from thresholded image gradients sampled over a 16x16 array of loca ons in the neighbourhood of the detected keypoint, using the level of the Gaussian pyramid at which the keypoint was detected. For each 4x4 region samples they are accumulated into a gradient orienta on histogram with 8 sampled orienta ons.
• In total it results into a 4x4x8 = 128 dimensional feature vector
4x4 neighbourhood region 16x16 keypoint neighbourhood A a keypoint
ass 8 sampled orientations
SIFT Gradient orienta on histogram
• The gradient magnitudes are downweighted by a Gaussian func on (red circle) in order to reduce the influence of gradients far from the center, as these are more affected by small misregistra ons (smoothed by a Gaussian func on with σ equal to 1.5 the keypoint scale).
• The 8-bin gradient orienta on histogram in each 4 × 4 quadrant, a is formed by so ly adding the weighted gradient magnitudes.
Orientation histogram
Dominant direction
So distribu on of values to adjacent histogram bins is performed by trilinear interpola on to reduce the effects of loca on and dominant orienta on mises ma on
Image from: Jonas Hurrelmann SIFT illumina on invariance
• The keypoint descriptor is normalized to unit length to make it invariant to intensity change, i.e. to reduce the effects of contrast or gain.
• To make the descriptor robust to other photometric varia ons, gradient magnitude values are clipped to 0.2 and the resul ng vector is once again renormalized to unit length. • SIFT has been empirically found to show very good performance, invariant to image rota on, scale, intensity change, and to moderate affine transforma ons [Mikolajczyk & Schmid 2005]’s extensive survey : 80% Repeatability at: 10% image noise 45° viewing angle 1k-100k keypoints in database
Color SIFT descriptors
• Local descriptors like SIFT are usually based only on luminance and shape, so they use grey-scale values and ignore color, mainly because it is very difficult to select a color model that it sufficiently robust and general. Nevertheless, color is very important to describe/dis nguish objects or scenes
• Different types of descriptors can be combined to improve representa on; the most common combina on is between a local shape-descriptor (e.g. SIFT) and a color descriptor (e.g. color histogram in a “smart” color space like Luv or HSV)
An example of color-SIFT (sparse) descriptor (van de Weijer and Schmid, ECCV 2006). The combined descriptor is obtained by fusion of standard SIFT and Hue descriptor
Courtesy J. van de Weijer Dense gray and color SIFT
• SIFT descriptors can also be taken at fixed loca ons by defining a regular grid over the image. In this case at each center point, 128-d SIFT descriptors are computed. In this case the descriptor account for the distrubu on of the gradient orienta on but does not has scale invariance as obtained from the DoG detector. • Mul ple descriptors are therefore computed to allow for scale varia on
128-d SIFT
(128 x 3)-d SIFT SIFT descriptors in numbers
• 1 patch (1 SIFT descriptor) = 128 float = 128 * 4 byte (32bit) = 512 byte.
• 1 image 320x240 well textured ~ 600 keypoints= 600*512 byte = 307200 byte = 300 KB.
• This presenta on in memory= 51 slides = 300 KB*51 = 15300 KB ~ 15 MB.
….. SIFT: implementa ons available
• SIFT: available implementa ons: − David Lowe’s: first code of SIFT algorithm by its creator (only binary). − OpenCV 2.2 implementa on - wrapper from Vedaldi code. SIFT original Lowe’s algorithm is quite slow (~6 sec for an image of size 1280x768): it is computa onally expensive and copyrighted
− Implementa on by Rob Hess. Best ever − enta ons:
A C B
References: A.h p://www.cs.ubc.ca/~lowe/keypoints/ B.h p://opencv.willowgarage.com/documenta on/cpp/features2d_feature_detec on_and_descrip on.html#si C.h p://blogs.oregonstate.edu/hess/code/si / Rob Hess’ SIFT implementa on
Features: • Best Open Source SIFT implementa on in terms of speed, efficiency and similarity compared to Lowe’s binary. • SIFT library is wri en in C with versions available for both Linux and Windows • Easy to integrate with an OpenCV Project.
Contains a suite of algorithms: •SIFT keypoint detec on/descrip on •KD-tree matching •Robust plane-to-plane trasforma on
References: 1.h ps://web.engr.oregonstate.edu/%7Ehess/publica ons/si lib-acmmm10.pdf 2.h p://blogs.oregonstate.edu/hess/code/si /
Compile and Install: • Unix pla orms are preferred • On a debian-based simply: 1. sudo apt-get install build-essen al libgtk2.0-dev libcv-dev libcvaux-dev 2. cd