Winter Road Surface Condition Recognition Using a Pre-trained Deep Convolutional Neural Network

Guangyuan Pan
Postdoc Fellow, Department of Civil & Environmental Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Email: [email protected]

Liping Fu*
Professor, Department of Civil & Environmental Engineering, University of Waterloo, Waterloo, ON, Canada, N2L 3G1; Intelligent Transportation Systems Research Center, Wuhan University of Technology, Mailbox 125, No. 1040 Heping Road, Wuhan, Hubei 430063
Email: [email protected]

Ruifan Yu
Master Student, David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Email: [email protected]

Matthew Muresan
Ph. D. Student, Department of Civil & Environmental Engineering, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Email: [email protected]

ABSTRACT

This paper investigates the application of the latest machine learning technique, deep neural networks, for classifying road surface conditions (RSC) based on images from smartphones. Traditional machine learning techniques such as support vector machine (SVM) and random forests (RF) have been attempted in the literature; however, their classification performance has been less than desirable due to challenges associated with image noise caused by sunlight glare and residual salts. A deep learning model based on convolutional neural network (CNN) is proposed and evaluated for its potential to address these challenges for improved classification accuracy. In the proposed approach we introduce the idea of applying an existing CNN model that has been pre-trained using millions of images with proven high recognition accuracy. The model is extended with two additional fully-connected layers of neurons for learning the specific features of the RSC images. The whole model is then trained with a low learning rate for fine-tuning by using a small set of RSC images. Results show that the proposed model has the highest classification performance in comparison to traditional machine learning techniques. The testing accuracy with different training dataset sizes is also analyzed, showing the potential of achieving much higher accuracy with a larger training dataset.

INTRODUCTION

Winter road surface condition (RSC) monitoring is of critical importance for winter road maintenance contractors and the traveling public. Real-time reliable RSC data can enable winter maintenance personnel to deploy the right type of maintenance treatments with the right amount of deicing materials at the right time, leading to significant savings in costs and reduction in salt use. The traveling public could make more informed decisions on whether or not to travel, what mode to use and which route to drive for safer mobility. RSC monitoring is traditionally done by manual patrolling by highway agencies and maintenance contractors or using road weather information system (RWIS) stations (1-2). Manual patrolling provides very high spatial resolution with additional qualitative details, but suffers from the drawbacks of being subjective, labor-intensive, and time-consuming. On the other hand, RWIS stations provide continuous information on a wide range of road and weather conditions; however, they are costly and can only be installed at a limited number of locations, limiting their spatial coverage. In recent years, new technologies have been developed to automate the RSC monitoring process, such as CCTV cameras, in-vehicle video recorders, smartphone-based systems, and high-end imaging systems (3-7). However, these solutions have been found to be still limited in terms of working conditions and classification accuracy.

This research focuses on exploring the potential of applying one of the most successful machine learning models, deep learning, for classifying winter road surface conditions based on image data. In Section 1, this paper first describes the idea behind the proposed RSC monitoring system and subsequently evaluates its performance as intended in the field. Section 2 reviews the previous studies that are related to this research, including image recognition and deep learning. Section 3 introduces the model that we propose. Section 4 describes the data information, designs the experiment, discusses the model setting, and analyzes the results. The paper then concludes in Section 5.

LITERATURE REVIEW

The image recognition problem has been studied extensively in the literature. Many machine learning models have been proposed in the past decades. In our previous research, three machine learning techniques, artificial neural networks (ANN), random trees (RT) and random forests (RF), were evaluated using images of bare, partly snow-covered or fully snow-covered winter roads collected during the winter season of 2014. ANN, which is a black-box model, is commonly used to recognize patterns and model complex relationships among variables (8-9). Random Forest (RF) is an ensemble of classification trees that produces a result based on the majority output from the individual trees, in which each tree is constructed using a bootstrapped sample of the total data set (random sampling with replacement) (10). With a sufficient number of trees the predictions tend to converge, resulting in a reliable algorithm that is relatively robust to outliers and noise.

Deep learning (DL), or deep neural network (DNN), is a novel machine learning technique that has been widely explored and successfully applied to a variety of problems such as image and voice recognition and games (11-12). In particular, convolutional neural networks (CNN) have recently shown great success in pattern recognition problems, such as large-scale image and video analysis. This achievement results from both the large public image repositories (such as ImageNet) and high-performance computing systems like GPU and the recent tensor processing unit (TPU) manufactured by Google (13-14). CNNs have since become common in many machine learning fields, and better performance has been achieved by improving the original architecture and algorithms. For example, researchers have been proposing models with larger layers and deeper structures; however, the deeper the network is, the more difficult the training process is (15-16). In Karen Simonyan's work, the model, VGG16, secured the first and the second places in the localization and classification tracks respectively in ImageNet Challenge 2014 (17). However, deep learning is generally a big-data technology; in studies with insufficient samples, it will struggle to learn useful features from the input. In these cases, the raw images often need to be preprocessed, as they would be for traditional machine learning approaches. In our research, we will present a simple yet effective method that builds a powerful image classifier, using raw data from only a small set of training examples. Our model is based on the model of VGG16 (17), which is pre-trained with learned features that are useful for most image recognition problems.

DEEP LEARNING MODEL

Convolutional neural network (CNN) is one of the deep learning models that have been especially successful for image classification. An example structure of CNN is shown in Figure 1(a), which is also included as one of the model options in our subsequent analysis.

Pre-trained deep CNN Model Structure

Instead of training a completely new CNN model, which often requires a significant amount of data and computational time, an alternative approach would be to make use of a CNN model which has already been trained with proven performance (19-20). Such a model would have already learned features that are useful for most computer vision problems; leveraging such features would allow us to reach a better accuracy than any method that would only rely on the available data. In this research, we use a pre-trained deep CNN model called VGG16, introduced above.

Figure 1(b) shows the overall structure of the VGG16 deep CNN. The network can be divided into five convolutional blocks and a fully connected block. The first two convolutional blocks each contain two convolutional layers with a 3 × 3 receptive field, with 64 and 128 convolutional kernels respectively. The receptive field dimensions in this case refer to how big an area (in pixels) the next layer can observe from the previous layer. The number of convolutional kernels (studied carefully in the original VGG16 paper) determines how many features a convolutional layer can learn from its previous layer. The last three convolutional blocks contain three convolutional layers each; they also have a 3 × 3 receptive field, with 256, 512 and 512 convolutional kernels, respectively. The original raw images are first resized into a three-channel (Red, Green, Blue) image with dimensions of 150 × 150. The RGB values are then normalized by subtracting the mean RGB value of the image from the RGB value of each pixel. While this process results in some minor information loss, it significantly helps reduce computation times.

During training, the image is passed through a stack of convolutional layers, where only a very small receptive field of 3 × 3 is used. The convolution stride is fixed to 1 pixel. Spatial pooling is applied to the output at the end of each of the five convolutional blocks. This is a technique commonly used in computer vision which applies a statistical measure across a group of pixels by scanning a window of