Local Descriptors
Total Page:16
File Type:pdf, Size:1020Kb
Local descriptors SIFT (Scale Invariant Feature Transform) descriptor • SIFT keypoints at locaon xy and scale σ have been obtained according to a procedure that guarantees illuminaon and scale invance. By assigning a consistent orientaon, the SIFT keypoint descriptor can be also orientaon invariant. • SIFT descriptor is therefore obtained from the following steps: – Determine locaon and scale by maximizing DoG in scale and in space (i.e. find the blurred image of closest scale) – Sample the points around the keypoint and use the keypoint local orientaon as the dominant gradient direc;on. Use this scale and orientaon to make all further computaons invariant to scale and rotaon (i.e. rotate the gradients and coordinates by the dominant orientaon) – Separate the region around the keypoint into subregions and compute a 8-bin gradient orientaon histogram of each subregion weigthing samples with a σ = 1.5 Gaussian SIFT aggregaon window • In order to derive a descriptor for the keypoint region we could sample intensi;es around the keypoint, but they are sensi;ve to ligh;ng changes and to slight errors in x, y, θ. In order to make keypoint es;mate more reliable, it is usually preferable to use a larger aggregaon window (i.e. the Gaussian kernel size) than the detec;on window. The SIFT descriptor is hence obtained from thresholded image gradients sampled over a 16x16 array of locaons in the neighbourhood of the detected keypoint, using the level of the Gaussian pyramid at which the keypoint was detected. For each 4x4 region samples they are accumulated into a gradient orientaon histogram with 8 sampled orientaons. • In total it results into a 4x4x8 = 128 dimensional feature vector 4x4 neighbourhood region 16x16 keypoint neighbourhood A a keypoint ass 8 sampled orientations SIFT Gradient orientaon histogram • The gradient magnitudes are downweighted by a Gaussian func;on (red circle) in order to reduce the influence of gradients far from the center, as these are more affected by small misregistraons (smoothed by a Gaussian func;on with σ equal to 1.5 the keypoint scale). • The 8-bin gradient orientaon histogram in each 4 × 4 quadrant, a is formed by soly adding the weighted gradient magnitudes. Orientation histogram Dominant direction SoX distribu;on of values to adjacent histogram bins is performed by trilinear interpolaon to reduce the effects of locaon and dominant orientaon mises;maon Image from: Jonas Hurrelmann SIFT illuminaon invariance • The keypoint descriptor is normalized to unit length to make it invariant to intensity change, i.e. to reduce the effects of contrast or gain. • To make the descriptor robust to other photometric variaons, gradient magnitude values are clipped to 0.2 and the resul;ng vector is once again renormalized to unit length. • SIFT has been empirically found to show very good performance, invariant to image rota.on, scale, intensity change, and to moderate affine transformaons [Mikolajczyk & Schmid 2005]’s extensive survey : 80% Repeatability at: 10% image noise 45° viewing angle 1k-100k keypoints in database Color SIFT descriptors • Local descriptors like SIFT are usually based only on luminance and shape, so they use grey-scale values and ignore color, mainly because it is very difficult to select a color model that it sufficiently robust and general. Nevertheless, color is very important to describe/dis;nguish objects or scenes • Different types of descriptors can be combined to improve representaon; the most common combinaon is between a local shape-descriptor (e.g. SIFT) and a color descriptor (e.g. color histogram in a “smart” color space like Luv or HSV) An example of color-SIFT (sparse) descriptor (van de Weijer and Schmid, ECCV 2006). The combined descriptor is obtained by fusion of standard SIFT and Hue descriptor Courtesy J. van de Weijer Dense gray and color SIFT • SIFT descriptors can also be taken at fixed locaons by defining a regular grid over the image. In this case at each center point, 128-d SIFT descriptors are computed. In this case the descriptor account for the distrubu;on of the gradient orientaon but does not has scale invariance as obtained from the DoG detector. • Mul;ple descriptors are therefore computed to allow for scale variaon 128-d SIFT (128 x 3)-d SIFT SIFT descriptors in numbers • 1 patch (1 SIFT descriptor) = 128 float = 128 * 4 byte (32bit) = 512 byte. • 1 image 320x240 well textured ~ 600 keypoints= 600*512 byte = 307200 byte = 300 KB. • This presentaon in memory= 51 slides = 300 KB*51 = 15300 KB ~ 15 MB. ….. SIFT: implementaons available • SIFT: available implementaons: − David Lowe’s: first code of SIFT algorithm by its creator (only binary). − OpenCV 2.2 implementaon - wrapper from Vedaldi code. SIFT original Lowe’s algorithm is quite slow (~6 sec for an image of size 1280x768): it is computaonally expensive and copyrighted − Implementaon by Rob Hess. Best ever − entaons: A C B References: A.htp://www.cs.ubc.ca/~lowe/keypoints/ B.htp://opencv.willowgarage.com/documentaon/cpp/features2d_feature_detec;on_and_descrip;on.html#siX C.htp://blogs.oregonstate.edu/hess/code/siX/ Rob Hess’ SIFT implementaon Features: • Best Open Source SIFT implementaon in terms of speed, efficiency and similarity compared to Lowe’s binary. • SIFT library is writen in C with versions available for both Linux and Windows • Easy to integrate with an OpenCV Project. Contains a suite of algorithms: •SIFT keypoint detec;on/descrip;on •KD-tree matching •Robust plane-to-plane trasformaon References: 1.htps://web.engr.oregonstate.edu/%7Ehess/publicaons/siXlib-acmmm10.pdf 2.htp://blogs.oregonstate.edu/hess/code/siX/ Compile and Install: • Unix plaorms are preferred • On a debian-based simply: 1. sudo apt-get install build-essen;al libgtk2.0-dev libcv-dev libcvaux-dev 2. cd <path-to-siX>/ && make all 3. ./bin/siXfeat -h Output: Usage: siXfeat [op;ons] <img_file> Opons: -h Display this message and exit -o <out_file> Output keypoints to text file -m <out_img> Output keypoint image file (format determined by extension) -i <intervals> Set number of sampled intervals per octave in scale space pyramid (default 3) -s <sigma> Set sigma for ini;al gaussian smoothing at each octave (default 1.6000) -c <thresh> Set threshold on keypoint contrast |D(x)| based on [0,1] pixel values (default 0.1400) -r <thresh> Set threshold on keypoint rao of principle curvatures (default 10) -n <width> Set width of descriptor histogram array (default 4) -b <bins> Set number of bins per histogram in descriptor array (default 8) -d Toggle image doubling (default on) -x Turn off keypoint display SIFT parameters to tune ( in si#.h or as argument of ./bin/si#feat) : − SIFT_CONTR_THR [0.04] is the default threshold on keypoint contrast |D(x)|. To high values correspond less but stronger keypoints and vice versa. −SIFT_DESCR_HIST_BINS [8] is the default number of bins per histogram in descriptor array. Trade-off between dis;nc;veness and efficiency. −SIFT_IMG_DBL [0/1] tells if the image must be resized twice before keypoint localizaon. Increase the detectable keypoints lowering performance. NN Matching parameters to tune ( in ./src/match.c) : − KDTREE_BBF_MAX_NN_CHKS [200] is the maximum number of keypoint NN candidates to check during BBF search. − NN_SQ_DIST_RATIO_THR [0.49] is the threshold on squared rao of distances between 1-st NN and 2-nd NN. Used to reject noisy matches in high dimensions. SIFT alternaves • GLOH (Gradient Locaon and Orientaon Histogram). – larger ini;al descriptor + PCA, TPAMI 2005 • SURF (Speeded Up Robust Features) – faster than SIFT and some;mes more robust. ECCV 2006. (343, Google citaons) • GIST – Rapid Biologically-Inspired Scene Classificaon Using Features Shared with Visual Aten;on, TPAMI 2007 • LESH – Head Pose Es;maon In Face Recogni;on Across Pose Scenarios, VISAPP 2008 • PCA-SIFT – A More Dis;nc;ve Representaon for Local Image Descriptors, CVPR 2004 • Spin Image – Sparse Texture Representaon Using Affine-Invariant Neighborhoods, CVPR 2003 GLOH (Gradient Locaon and Orientaon Histogram) descriptor • GLOH is a method for local shape descrip;on very similar to SIFT introduced by Miko in 2005 • Differently from SIFT it employs a log-polar locaon grid: – 3 bins in radial direc;on – 8 bins in angular direc;on – 16 bins for Gradient orientaon quan;zaon • The GLOH descriptor is therefore a higher dimensional vector with a total of 17 (i.e. 2x8+1) * 16 = 272 bins. PCA dimension reduc;on is employed in the vector representaon space Example 16 dim 2x8 +1 dim GLOH SURF (Speed Up Robust Features) descriptor • SURF is a performant scale and rotaon invariant interest point detector and descriptor. It approximates or even outperforms SIFT and other local descriptors with respect to repeatability, dis;nc;veness, and robustness, yet can be computed and compared much faster • This is achieved by: − relying on integral images for image convolu;ons − building on the strengths of the leading exis;ng detectors and descriptors (using a Hessian matrix-based measure for the detector, and a distribu;on-based descriptor) − simplifying these methods to the essen;al − The approach of SURF is similar to that of SIFT but ... – Integral images in conjunc;on with Haar Wavelets are used to increase robustness and decrease computaon ;me − Instead of iteravely reducing the image size (like in the SIFT approach), the use of integral images allows the up-scaling of the filter at constant cost “SIFT approach” “SURF approach” − The SURF descriptor computaon can be divided into two tasks: − Orientaon assignment − Extrac;on of descriptor components SURF integral