A Genetic Algorithm for Convolutional Network Structure Optimization for Concrete Crack Detection
Total Page:16
File Type:pdf, Size:1020Kb
A Genetic Algorithm for Convolutional Network Structure Optimization for Concrete Crack Detection Spencer Gibb, Hung Manh La, and Sushil Louis Abstract— A genetic algorithm (GA), is used to optimize the explanation of crack detection and CNNs. Related work is many parameters of a convolutional neural network (CNN) also provided to help show the novelty of the method. that control the structure of the network. CNNs are used in image classification problems where it is necessary to generate A. Image Classification feature descriptors to discern between image classes. Because of the deep representation of image data that CNNs are capable Image classification is used in many applications including of generating, they are increasingly popular in research and face recognition, object detection, and scene recognition [2], industry applications. With the increasing number of use cases [3], [4]. Typically, the image classification process involves for CNNs, more and more time is being spent to come up modeling a set of training data so that it is possible to discern with optimal CNN structures for different applications. Where one CNN might succeed at classification, another can fail. As between two or more classes of objects or images using a a result, it is desirable to more easily find an optimal CNN classifier. Then, once the data is modeled, the classifier can structure to increase classification accuracy. In the proposed be applied to a new set of data in an application setting. One method, a GA is used to evolve the parameters that influence example of a real-world image classification task is crack the structure of a CNN. The GA compares CNNs by training detection in images of concrete [5], [6]. Locating cracks in them on images of concrete containing cracks. The best CNN after several generations of the GA is then compared to the images of concrete can be useful in applications such as civil state-of-the-art CNN for crack detection. This work shows that infrastructure inspection, where it is desirable to be able to it is possible to generalize the process of optimizing a CNN for perform inspection of parking garages, bridges, and other image classification through the use of a GA. structures in a fully autonomous manner [7], [8], [9]. Using the location of cracks found in images from an on-board I. BACKGROUND AND RELATED WORK camera, it is possible to offer a convenient and cost-effective In the proposed method, a GA is used to optimize the map of the area. parameters of a CNN. The goal of this method is to Crack detection in concrete and pavement has been at- optimize the performance of the CNN at detecting cracks tempted with many image processing, computer vision, and in concrete surface. Concrete crack detection is a type of machine learning techniques. Initial attempts to detect cracks image classification problem where classic image analysis in images of concrete were performed using methods such methods have failed due to the various difficulties in the as the Canny edge detector, Sobel filter, Laplace of Gaus- crack detection process, including: illumination changes, sian (LoG) [10], [11] for edge detection, and other simple background clutter in images, and the shape of the cracks. methods. These methods do not perform well in cases where CNNs have shown that they are capable of attaining high the images are noisy, have shadows, or have textured back- accuracy at crack detection in [1], but choosing a CNN grounds. In an attempt to improve crack detection accuracy structure that performs well is a time consuming process compared to these classic methods, newer modeling and due to the number of parameters that affect the network learning techniques have been applied to crack detection in structure. For this reason, a GA is applied to search the recent research. Among these techniques are random forest space of possible network structures. In this section, a review classifiers, neural networks, percolation-based edge detec- of general image classification is provided, as well as an tion, frequency domain filtering, and CNNs [12], [13], [14], [15], [1]. While modern work tends to show high accuracy Financial support for this study was provided by the U.S. Department of in crack classification, various models and techniques work Transportation, Office of the Assistant Secretary for Research and Technol- better or worse depending on the test environment, camera ogy (USDOT/OST-R) under Grant No. 69A3551747126 through INSPIRE University Transportation Center (http://inspire-utc.mst.edu) at Missouri system, lighting, the presence of other objects in the image, University of Science and Technology. The views, opinions, findings and and the size and depth of cracks in the test images [16]. conclusions reflected in this publication are solely those of the authors and Some of our previous work has focused on implementing do not represent the official policy or position of the USDOT/OST-R, or any State or other entity. Dr. Louis was supported by grant number N00014-17- the CNN in [1] on a robotic platform designed at the Univer- 1-2558 and N00014-15-1-2015 from the Office of Naval Research. Any sity of Nevada, Reno campus. This network is the best crack opinions, findings, and conclusions or recommendations expressed in this detection method, prior to the proposed method, that we are material are those of the author(s) and do not necessarily reflect the views of the Office of Naval Research. currently aware of and is capable of being run in quasi- Spencer Gibb and Dr. Hung La are with the Advanced Robotics and real time. However, because of the conditions discussed Automation (ARA) Laboratory. Dr. Sushil Louis is with the Evolutionary previously and the difficulty of the crack detection problem, Computing Systems Lab (ECSL), Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557, USA. Corresponding this network is still not capable of detecting cracks reliably author: Hung La, email: [email protected] under difficult circumstances. For these reasons, a more 978-1-5090-6017-7/18/$31.00 ©2018 IEEE The last type of layer that is used often in CNNs is a fully-connected layer. This type of layer works in a similar way to a multilayer perceptron network, which is a classic type of neural network. In fully-connected layers, each input node is connected to each output node directly, and a weight is associated with each connection. These weights are Fig. 1: An example of max-pooling. The input (left) is split optimized using the back-propagation process [17], which into four parts and each part is searched for the maximum will not be discussed here for the sake of space; however, local value. The output (right) contains the maximum values an output node in the fully-connected layer sums its value found in the search. as follows: Output = ω1V1 + ... + ωN VN , (2) robust solution to the crack detection problem is desired. where there are N nodes connected to the output node in While deep learning - and more specifically CNNs - provide Equation (2), and each node has an associated weight, ωi, the best results of any current learning method, constructing and value, Vi, where i =1, ..., N. Fully-connected layers an optimal network is an extremely time consuming task are typically connected to the end of a CNN as a final way due to the size of the search space, which requires a brief to introduce non-linearity into the network. However, large introduction to CNNs in order to understand. fully-connected layers need to be used with caution since they can quickly introduce many weights to be optimized B. Convolutional Network Structure and greatly increase the amount of training time required for the network. This is a result of the N 2 weights that need to There are several types of layers that make up the structure be optimized for each fully-connected layer. of CNNs including: convolution layers, pooling layers, and fully-connected layers. Convolution layers perform multipli- C. Application of a Genetic Algorithm cation between the input image, or two-dimensional array, CNNs require a large number of parameters to be op- and a filter that is also a two-dimensional array. This mul- timized function at peak performance, which can lead to tiplication happens at every location in the input array and long training times for networks and often require a deep the result is then summed. This process yields another two- understanding of the application and data in order to be used dimensional array and can be described mathematically as successfully. The list of parameters that must be considered follows: in order to construct a CNN include: the number of layers s s in the network, the activation functions used in the network, 2 2 f[x, y] ∗ g[x, y]= f[i, j] · g[x − i, y − j], (1) the size of layer operations (for convolution and pooling/sub- s s sampling), the number of layers used in each convolution, i=− 2 j=− 2 the training parameters for the optimizer, and more. This where f is the input image, or array, g is the filter, s is the results in an infinite number of possible network structures, size of the filter, which is a square filter, and x and y are and it is often too time consuming to get good results for locations in the input arrays. However, Equation (1) typically image classification problems because of the complexity of happens many times in each convolution layer but with constructing deep learning networks. To illustrate this point, different filters used for g. Thus, there can be a set of filters the network structure used in [1] is shown in Figure 2.