A Comparative Study of Stereovision Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 8, No. 11, 2017 A Comparative Study of Stereovision Algorithms Elena Bebeşelea-Sterp Raluca Brad Remus Brad NTT DATA ROMANIA Faculty of Engineering Faculty of Engineering Sibiu, Romania Lucian Blaga University of Sibiu Lucian Blaga University of Sibiu Sibiu, Romania Sibiu, Romania Abstract—Stereo vision has been and continues to be one of Stereo vision has a wide range of applications nowadays, the most researched domains of computer vision, having many especially domains in which realistic object models are needed. applications, among them, allowing the depth extraction of a Depending on how important the processing time is, these scene. This paper provides a comparative study of stereo vision applications can be classified into two categories: static scene and matching algorithms, used to solve the correspondence description and dynamic scene description. For the first problem. The study of matching algorithms was followed by category, accuracy is of higher importance compared to the experiments on the Middlebury benchmarks. The tests focused processing time. Usually, image pairs are acquired by means of on a comparison of 6 stereovision methods. In order to assess the a special device and reconstructed afterwards for cartography, performance, RMS and some statistics related were computed. In crime scene reconstruction, car crashes scene reconstruction, order to emphasize the advantages of each stereo algorithm 3D models for architecture. In dynamic scene description real- considered, two-frame methods have been employed, both local and global. The experiments conducted have shown that the best time processing of the data is critical. Of course, a certain level results are obtained by Graph Cuts. Unfortunately, this has a of accuracy must be fulfilled. Possible applications are obstacle higher computational cost. If high quality is not an issue in (e.g. pedestrians) avoidance, autonomous navigation in which applications, local methods provide reasonable results within a an internal map is continuously updated, height estimation, etc. much lower time-frame and offer the possibility of parallel This paper aims to review different stereo methods implementations. available and compare several stereo matching algorithms across sets of images. Section 2 describes the fundamental of Keywords—Stereo vision; disparity; correspondence; comparative study; middlebury benchmark stereo correspondence that makes possible the depth estimation, and discuss some of the stereo methods available nowadays, I. INTRODUCTION classifying them into local, global and semi-global algorithms, while Section 3 analyses in more detail some referenced Stereovision is an area of computer vision focusing on the algorithms considered for experiments. The results are extraction of 3D information from digital images. The most discussed in Section 4. researched aspect of this field is stereo matching: given two or more images as input, matching pixels have to be found across II. A CLASSIFICATION OF STEREOVISION ALGORITHMS all images so that their 2D positions can be converted into 3D depths, producing as result a 3D estimation of the scene. As This section reviews some of the stereo correspondence many other breakthrough ideas in computer science, algorithms that have been proposed over the last decades. stereovision is strongly related to a biological concept, namely These can be classified depending on multiple criteria such as stereopsis, which is the impression of depth that is perceived the method which assign disparities to pixels, occlusion when a scene is viewed by someone with both eyes and normal handling, color usage, matching cost computation etc. binocular vision [1]. By aid of stereoscopic vision, we can A. Local Methods perceive the surroundings in relation with our bodies and detect Local methods assign disparities depending on the objects that are moving towards or away from us. While the information provided by the neighboring pixels, usually fast entire process seems an easy task for us and other biological and yield good results. One broad category of local methods is systems, the same does not apply to computer systems. Finding represented by the block matching algorithms, which try to the correspondences across images is a challenging task. The find a correspondence for a point in the reference image by earliest stereo matching algorithms were developed in the field comparing a small region surrounding it with small regions in of photogrammetry for automatically constructing topographic the target image. This region is reduced to a single line, called elevation maps from overlapping aerial images [2]. the epipolar line. Block matching algorithms are used not only Stereo matching has been one of the most studied topics, in stereo vision, but also in visual tracking and video starting with the work of D. Marr and T. Poggio [3] which compression. focuses on human stereopsis. Lane and Thacker’s study on Among the first techniques which appeared is box filtering. stereo matching [4] presents a dozen algorithms from 1973 up This involves replacing each pixel of an image with the to 1992, but no comparison between them was made. Another average in a box and it is an efficient general purpose tool for important study is the one done by Scharstein and Szeliski [5], image processing. In [6] a procedure for mean calculation has in which they have compared several algorithms and their been proposed. The advantage of box filtering is the speed, performance based on different metrics. cumulating the sum of pixel values along rows and columns. 359 | P a g e www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 8, No. 11, 2017 This way, the sum of a rectangle is computed in a linear time, Another class of local algorithms is represented by the independent of its size. gradient-based methods, also known as optical flow. In this case, the matching costs are insensitive to bias and camera Another algorithm that achieves high speed and reasonable noise. These methods determine the disparities between two quality is the one proposed by Mühlmann et al. [7]. It makes images by formulating a differential equation relating motion use of rectified images and can be used by other, more and image brightness. In order to do this, the assumption is sophisticated, algorithms that need an initial guess at the made that the image brightness of a point in the scene is disparity map. In order to eliminate false matches it makes use constant between the two views [16]. of the left-right consistency check and uniqueness validation. One of the earliest methods for optical flow estimation is A new block matching algorithm has been proposed in [8] the one developed by Kanade and Lucas [17]. This solves the and even though the focus of this paper was on video flow equations for all the pixels in the neighborhood using the compression, the algorithm can be applied also to stereo vision. least squares criterion. However it cannot provide flow This is based on a winner-update strategy which uses a lower information inside regions that are uniform. This method can bound list of the matching error to determine the temporary be used in many applications of image registration, including winner. stereo vision. More similar information can be found in [18]. Efficient usage of color information can improve results, A newer algorithm from the same class is the one proposed therefore, some of the best stereo correspondence algorithms by Zhou and Boulanger [19]. This is based on relative use color segmentation techniques. One of the most popular gradients in order to eliminate the radiometric variance. Most color segmentation algorithms has been proposed in [9], based stereo matching methods assume that the object surface is on the mean-shift algorithm which dates back to 1975, but Lambertian, meaning that the color for every point in the scene extended on computer vision only later on. The algorithm will be constant in the views captured by two separate cameras. proposed in [10] is one of the top performing algorithms in the However, to most real-world objects this does not apply, Middlebury classification [11] and it makes use of this reflecting light that is view dependent. The algorithm is able to segmentation technique. This is based on inter-regional deal with both view dependent and independent colors and cooperative optimization. The algorithm uses regions as both color and gray scale images. matching primitives. More exactly, they use color statistics of the regions and constraints on smoothness and occlusion Block matching and gradient-based methods are sensitive between adjacent regions. A similar algorithm, that is currently to depth discontinuities, since the region of support near a one place above the before mentioned [10], is the algorithm discontinuity contains points from more than one depth. These proposed by Klaus et al. in [12]. Firstly, homogenous regions methods are also sensitive to regions of uniform texture [16]. are extracted from the reference image, also with the method Another class of algorithms that aims to overcome these [9]. Secondly, local windows-based matching is applied using drawbacks is represented by the feature-matching algorithms, a self-adapting dissimilarity measure that combines SAD and a which limit the support region to reliable features in the image. gradient based measure. Using the reliable correspondences,