Vision‐Based Vehicle Speed Estimation: a Survey
Total Page:16
File Type:pdf, Size:1020Kb
Received: 14 January 2021 Revised: 31 March 2021 Accepted: 4 May 2021 IET Intelligent Transport Systems DOI: 10.1049/itr2.12079 REVIEW Vision-based vehicle speed estimation: A survey David Fernández Llorca1,2 Antonio Hernández Martínez1 Iván García Daza1 1 Computer Engineering Department, University of Abstract Alcalá, University Campus, Alcalá de Henares, The need to accurately estimate the speed of road vehicles is becoming increasingly impor- Madrid, Spain tant for at least two main reasons. First, the number of speed cameras installed worldwide 2 European Commission, Joint Research Center, Seville, Spain has been growing in recent years, as the introduction and enforcement of appropriate speed limits are considered one of the most effective means to increase the road safety. Second, Correspondence traffic monitoring and forecasting in road networks plays a fundamental role to enhance David Fernández Llorca, Computer Engineering traffic, emissions and energy consumption in smart cities, being the speed of the vehicles Department, University of Alcalá, University Cam- one of the most relevant parameters of the traffic state. Among the technologies available pus, Alcalá de Henares, 28805, Madrid, Spain. Email: [email protected] for the accurate detection of vehicle speed, the use of vision-based systems brings great challenges to be solved, but also great potential advantages, such as the drastic reduction Funding information of costs due to the absence of expensive range sensors, and the possibility of identifying Community Region of Castilla la Mancha, Spain, vehicles accurately. This paper provides a review of vision-based vehicle speed estimation. Grant/Award Number: CLM18-PIC-051; Com- munity Region of Madrid (Spain), Grant/Award The terminology and the application domains are described and a complete taxonomy of Number: S2018/EMT-4362 SEGVAUTO 4.0-CM; a large selection of works that categorizes all stages involved is proposed. An overview of Ministerio de Ciencia e Innovación, Grant/Award performance evaluation metrics and available datasets is provided. Finally, current limita- Number: DPI2017-90035-R tions and future directions are discussed. 1 INTRODUCTION The requirements concerning the uncertainty and robustness on the vehicle speed estimation from the infrastructure depends The ability to provide an accurate estimate of the speed of on the application scenario. On the one hand, for speed enforce- road vehicles is a key feature for Intelligent Transport Systems ment applications, extreme maximum permissible error stan- (ITS). This requires tackling problems such as synchronized dards are required by national or international certification agen- data recording, representation, detection and tracking, distance cies [1], which is more than reasonable considering that these and speed estimation etc., and it can be addressed using data automatic systems are used to fine, or even imprison, drivers from sensors of different nature (e.g. radar, laser or cameras) who exceed the maximum speed limit. This implies that the and under different lighting and weather conditions. most common sensors used for speed measurement in these This is a well-known topic in the field with considerable con- applications are based on high-precision high-cost range sen- tributions from both academia and industry. The most impor- sors, such as radar (Doppler effect) and laser (time of flight), tant application domains are speed limit enforcement, traffic or, in some cases, accurate on-pavement sensors (e.g. inductive monitoring and control, and, more recently, autonomous vehi- or piezoelectric loop detectors) embedded in pairs under the cles. However, it is important to note that the nature of the road surface. The use of visual cues to measure physical vari- problem of speed detection differs in a not insignificant way, ables related with the motion of the vehicles has been rarely depending on whether it is done from the infrastructure (fixed proposed. Video cameras usually play a secondary role as a sensors) or from a mobile platform (intelligent/autonomous means of capturing a human-understandable representation of vehicles or mobile speed cameras). Our study focuses on fixed the scene, allowing the visual identification of the vehicles, and systems from the infrastructure, although many of the presented serving as evidence for speed limit enforcement. On the other methodologies can be easily applicable to the case of mobile hand, for traffic monitoring, forecasting and control applica- sensors by compensating the speed of the ego-vehicle. tions, there is no regulatory framework or scientific consensus This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Intelligent Transport Systems published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology IET Intell. Transp. Syst. 2021;1–00. wileyonlinelibrary.com/iet-its 1 2 FERNÁNDEZ LLORCA ET AL. on the accuracy and uncertainty of speed estimation [2]. The requirements regarding sensors are therefore reduced, and it is more common to see the use of less accurate sensors [3] includ- ing traffic cameras [4]. In any case, it is obvious that the higher the accuracy of the speed estimation, the better the forecasting and control approaches will be. Recent advances in computer vision techniques have led to a significant increase in the number of works proposing the use of vision as the only mechanism for measuring vehicle speed. The challenge of making accurate estimations of distances and speeds arises from the discrete nature of video sensors that project the 3D world into a 2D discrete plane. This intrin- sic limitation results in a digital representation whose accuracy FIGURE 1 Main high-level components of a vision-based vehicle speed estimation system: input data, object detection and tracking, distance and speed follows an inverse-square law, that is, its quantity is inversely estimation, and outcome proportional to the square of the distance from the camera to the vehicle. Despite these limitations, the potential bene- fits of using video cameras are remarkable, since they are a 2.1 Input data / stimuli cost-efficient solution compared with range sensors such as microwave Doppler radars or laser-based devices. The possibil- Vision-based speed detection approaches are based on input ity of using already installed traffic cameras without the need data from video cameras. For each vehicle there will be a to integrate new sensors is another advantage. Also, the visual sequence of images from the first appearance to the last one. cues from these sensors can be used to address other problems The number of images available will depend on the camera pose such as automatic identification of the vehicle (license plate, with respect to the road, the focal length, the frame rate, and the make, model and color), re-identification in different places, and speed of the vehicle. Two different types of video cameras can if the resolution allows, the identification of occupants, or seat be considered: belt use. The scope of this survey is vision-based vehicle speed detec- ∙ Traffic cameras: also referred as traffic surveillance cameras or tion from the infrastructure, including fixed point speed cam- traffic CCTV. They are used to monitor traffic flow, con- eras and traffic cameras, and excluding mobile speed cameras or gestion and accidents both manually (visual inspection by a onboard cameras for intelligent vehicle applications. Despite the human operator) or automatically. In most cases, these are potential impact, and the high number of contributions avail- fixed cameras located far from the traffic flow, infrastructure- able, there is still no detailed survey of this topic. We describe or drone-based. the terminology and the application domains. We survey a large ∙ Speed cameras: also referred as traffic enforcement camera. selection of works and propose a complete and novel taxon- They are generally considered to be a device for monitor- omy that categorizes all the components involved in the speed ing the speed of vehicles independently of the sensor used detection process. Performance evaluation metrics and available to measure the speed (radar, laser or vision). Even for radar- datasets are examined. Finally, current limitations are discussed or laser-based systems, there is always a camera to record an and major open challenges for future research are outlined. We image of the vehicle, and hence the term camera.Inthiscase, hope that this work will help researchers, practitioners and pol- we use the term speed camera in the most literal way, that is, icy makers to have a detailed overview of the subject that will be vision-based systems for measuring speed. Their location is useful for future work. usually closer to the traffic flow than the traffic cameras. The structure of this paper is as follows: we present the ter- minology and describe the application domains in Section 2,we Other forms of input data include vehicle attributes such as present a general description of the taxonomy in Section 3,and vehicle type, keypoints, license plate size etc. Camera calibration we review and analyze each of its components in Sections 4–7. plays a key role in providing both intrinsic and extrinsic param- The description of the available datasets and evaluation metrics eters. Prior knowledge of the size of the road section reveals are provided in Section 7. Discussion of the current state of the information that is critical to calculating the extrinsic relation- art and future trends are provided in Section 8, and, finally, Sec- ship between the road and the camera or even the speed of tion 9 concludes the presented work. the vehicles. 2 OVERVIEW AND TERMINOLOGY 2.2 Detection and tracking The main components of a fixed vision-based vehicle speed The vehicles, or some of their representative features, must be measurement system are summarized in Figure 1. Four levels detected in all the available images.