remote sensing

Article

Fast Automatic Airport Detection in Remote Sensing Images Using Convolutional Neural Networks

Fen Chen 1,2,*, Ruilong Ren 1, Tim Van de Voorde 3,4, Wenbo Xu 1,2, Guiyun Zhou 1 and Yan Zhou 1

1 School of Resources and Environment, University of Electronic Science and Technology of China, 2006 Xiyuan Avenue, West Hi-Tech Zone, Chengdu 611731, Sichuan, China; [email protected] (R.R.); [email protected] (W.X.); [email protected] (G.Z.); [email protected] (Y.Z.)
2 Center for Information Geoscience, University of Electronic Science and Technology of China, 2006 Xiyuan Avenue, West Hi-Tech Zone, Chengdu 611731, Sichuan, China
3 Department of Geography, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; [email protected]
4 Department of Geography, Ghent University, Krijgslaan 281, S8, 9000 Ghent, Belgium
* Correspondence: [email protected]; Tel.: +86-28-6183-0279

Received: 5 February 2018; Accepted: 4 March 2018; Published: 12 March 2018

Abstract: Fast and automatic detection of airports from remote sensing images is useful for many military and civilian applications. In this paper, a fast automatic detection method is proposed to detect airports from remote sensing images based on convolutional neural networks using the Faster R-CNN algorithm. This method first applies a convolutional neural network to generate candidate airport regions. Based on the features extracted from these proposals, it then uses another convolutional neural network to perform airport detection. By taking the typical elongated linear geometric shape of airports into consideration, some specific improvements to the method are proposed. These approaches successfully improve the quality of positive samples and achieve a better accuracy in the final detection results. Experimental results on an airport dataset, Landsat 8 images, and a Gaofen-1 satellite scene demonstrate the effectiveness and efficiency of the proposed method.

Keywords: airport detection; convolutional neural network; region proposal network

1. Introduction

Fast and automatic detection of airports from remote sensing images is useful in many military and civilian applications [1,2]. In recent years, airport detection has gained increased attention and has been a topic of interest in computer vision and remote sensing research. Many researchers have studied this problem and have reported valuable results in the literature. Some previous works focused on using the characteristics of runways, which generally have obvious elongated parallel edges [3–7]. They used edge or line segment extraction methods to delineate the shape of airport runways. Some scholars used textural information or shape features for airport detection via classifiers, such as the support vector machine (SVM) or the Adaptive Boosting (AdaBoost) algorithm [8–10]. In remote sensing images, however, there can be ground objects such as roads or coastlines that demonstrate linear shape features or textural features similar to runways. This lowers the performance of these algorithms and can lead to a relatively high false alarm rate. Tao et al. [1] proposed to implement the airport detection task based on a set of scale-invariant feature transform (SIFT) local features. Within a dual-scale and hierarchical architecture, they first used clustered SIFT keypoints and segmented region information to locate candidate regions that may contain an airport at the coarse scale. Then, these candidate regions were mapped to the original scale and texture information was used for airport detection with a SVM classifier. Similarly, Zhu et al. [11]

Remote Sens. 2018, 10, 443; doi:10.3390/rs10030443 www.mdpi.com/journal/remotesensing

used SIFT descriptors and SVM to detect airports, but using a two-way saliency technique to predict candidate regions. Tang et al. [2] proposed a two-step classification method for airport detection. They first used a SVM classifier on line segments to extract candidate regions, and then used another SVM on texture features. Budak et al. [12] also used a line segment detector (LSD) to locate candidate regions, but they employed SIFT features and a Fisher vector representation to detect airports with a SVM classifier. More recently, Zhang et al. [13] proposed to detect airports from optical satellite images using deep convolutional neural networks, and achieved good results. However, the computational load of their method is heavy. They first used a hand-designed line detection approach to generate region proposals (from this point on, we use the terms 'candidate regions' and 'region proposals' interchangeably, as both are used in the literature), and then used AlexNet to identify airports in each proposal separately. A heavy computational load impedes this method's practical application when computational resources are limited and fast detection is required. Deriving candidate regions from line segments is indeed rather complicated and requires careful hand-tuning of many parameters. For example, the widely used LSD algorithm generally takes about 1 s for a typical image of size 880 × 1600. Computing features from images is also expensive. For example, the SIFT algorithm generally takes about 1 s per proposal, while computing texture features derived from the co-occurrence matrix generally takes more than 3 s. Current methods therefore cannot achieve an adequate real-time or near real-time detection performance.
As far as we know, all the airport detection methods reported in the literature cannot perform the detection procedure in less than 1 s. The common weakness of current methods therefore appears to be their heavy computational load. Nevertheless, computational efficiency is important to most applications. In recent years, many methods were proposed for detecting ordinary objects in natural images by using convolutional neural networks (CNN). Girshick et al. [14] first used region-based convolutional neural networks (R-CNN) for object detection. In their method, the selective search algorithm [15] is first used to generate region proposals. Then, each scale-normalized proposal is classified using features extracted with a convolutional neural network. Because this method performs a forward pass through the convolutional neural network for each object proposal without sharing computation, it is very slow. He et al. [16] proposed a method to speed up the original R-CNN algorithm by introducing a spatial pyramid pooling layer and reusing the features generated at several image resolutions. This method avoids repeatedly computing the convolutional features. Girshick [17] proposed a region of interest (RoI) pooling layer to extract a fixed-length feature vector from the feature map at a single scale. Based on the work of Girshick [17], Ren et al. [18] proposed the Faster R-CNN algorithm, which shows that the selective search proposals can be replaced by ones learned from a convolutional neural network. This method has demonstrated good object detection accuracy with fast computational speed. Recently, Redmon et al. [19,20] proposed two real-time object detection approaches. Although these two algorithms are very fast, they are not good at detecting small objects and they struggle to generalize to objects in new or unusual aspect ratios [19,20].
These CNN-based object detection approaches have achieved state-of-the-art results for ordinary objects in daily life scenes. However, for special objects like airports in remote sensing images, specific techniques to improve detection performance are still needed. In this paper, we propose a fast automatic airport detection method that relies on convolutional neural networks, using the Faster R-CNN algorithm [18]. Based on a deep convolutional neural network architecture, this method first uses a region proposal network (RPN) to obtain many candidate proposals. The features extracted from the CNN's convolutional feature map for these proposals are then used for airport detection. The CNN for airport detection and the RPN for region proposal share the same convolutional layers. This adds only a small computational load to the airport detection network, and allows for training the CNN and the RPN alternately with the same training dataset. To handle the typical elongated linear geometric shape of airports with the Faster R-CNN algorithm, which is typically used to detect ordinary objects in daily life scenes, we propose to include additional

scale and aspect ratios to increase the quality of positive samples. In addition, we use a constraint to sieve low quality positive samples. Experimental results on an airport dataset, Landsat 8 images, and a Gaofen-1 satellite scene demonstrate the effectiveness and efficiency of the proposed method.

The remainder of this paper is organized as follows. In Section 2, we first introduce the neural network architecture and the domain-specific fine-tuning and data augmentation methods of the proposed approach. We then present some specific improvement techniques for airport detection. In Section 3, we first test the proposed method on a set of airport images, and then evaluate its practical performance on Landsat 8 images and a Gaofen-1 satellite scene. Next, the results of these improvements are presented and discussed and, finally, conclusions are drawn in Section 4.

2. Methodology

In the machine learning area, CNNs have demonstrated a much better feature representation than traditional methods [21] and they have shown great potential in many visual applications. A CNN is formed by a stack of distinct layers, where each successive layer uses the output from the previous layer as input. These multiple layers make the network very expressive and able to learn relationships between inputs and outputs. By using an appropriate loss function on a set of labeled training data, which penalizes the deviation between the predicted and true labels, the weights of these layers can be obtained by supervised learning. In the following, we first introduce the network architecture of the proposed method for airport detection, and then present some specific improvement techniques of the proposed method. The flowchart of the proposed algorithm is shown in Figure 1.

Figure 1. The airport detection network architecture is shown in the blue frame, and the region proposal network architecture is shown in the green frame.


2.1. Convolutional Neural Network for Airport Detection

The Visual Geometry Group network with 16 layers (VGG-16) [22] was used in our experiments. This network has demonstrated better performance than many other networks by using a deeper convolutional network architecture [17,18,22]. The network has 13 convolutional layers, where the filters have 3 × 3 receptive fields with stride 1. The convolutional process is defined as

y_{i,j,d} = \sum_{i'=1}^{H} \sum_{j'=1}^{W} \sum_{d'=1}^{D} f_{i',j',d',d} \cdot x_{i'+i-1,\, j'+j-1,\, d'} + b_d    (1)

where x_{i,j,d} and y_{i,j,d} are the respective input and output neuron values at position (i, j) in the d-th channel, f is the H × W × D filter, and b_d is the bias in the d-th channel. The nonlinear rectified linear unit (ReLU)

f(x) = \max(0, x)    (2)

where x is the input to a neuron, is used for each layer. Max-pooling

y_{i,j,d} = \max_{i'=1,2;\, j'=1,2} x_{i+i'-1,\, j+j'-1,\, d}    (3)

where x_{i,j,d} and y_{i,j,d} are the input and output neuron values at position (i, j) in the d-th channel, respectively, is performed with stride 2, following two or three convolutional layers [21–23].

A RoI pooling layer and two fully connected layers were used to extract a fixed-length (4096-dimensional in the experiments) feature vector from the convolutional feature map for each proposal generated by the RPN, which will be introduced in the following section. The RoI pooling layer divides a region proposal into a fixed-size grid and then, in each grid cell, applies max-pooling to the values in the corresponding sub-window of the feature map. The extracted feature vector was used in one fully connected layer to produce the softmax probability estimate for the airport class, and in another fully connected layer to produce the four values that encode the refined bounding box position of the airport using regression. For the two fully connected layers, dropout regularization [24] was performed with probability 0.5 for individual nodes. More details about the network architecture can be found in [17,22]. If the score of a proposal generated by the network is larger than a pre-specified threshold, an airport is claimed to be detected in the refined proposal.
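As a concrete illustration of Equations (1)–(3), the three basic operations can be sketched in plain NumPy. This is a naive, unoptimized sketch for clarity only; the function names are ours, and frameworks such as Caffe implement these operations far more efficiently:

```python
import numpy as np

def conv2d(x, f, b):
    """Eq. (1): valid 2-D convolution with stride 1 and no padding.
    x is the (H_in, W_in, D) input, f the (H, W, D, D_out) filter bank,
    b the (D_out,) bias vector."""
    H, W, D, D_out = f.shape
    H_out, W_out = x.shape[0] - H + 1, x.shape[1] - W + 1
    y = np.zeros((H_out, W_out, D_out))
    for i in range(H_out):
        for j in range(W_out):
            patch = x[i:i + H, j:j + W, :]  # receptive field at (i, j)
            for d in range(D_out):
                y[i, j, d] = np.sum(f[:, :, :, d] * patch) + b[d]
    return y

def relu(x):
    """Eq. (2): element-wise rectified linear unit."""
    return np.maximum(0, x)

def max_pool_2x2(x):
    """Eq. (3): 2x2 max-pooling with stride 2 on an (H, W, D) input."""
    H, W, D = x.shape
    x = x[:H // 2 * 2, :W // 2 * 2, :]  # drop odd border rows/cols
    return x.reshape(H // 2, 2, W // 2, 2, D).max(axis=(1, 3))
```

In VGG-16, `conv2d` is applied with 3 × 3 filters, each convolution is followed by `relu`, and `max_pool_2x2` halves the spatial resolution after every stack of two or three convolutional layers.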

2.2. Region Proposal Network

The RPN is a deep network that produces proposals in an efficient forward computation from the feature map produced by the CNN described in the previous section. It first adds an additional convolutional layer on top of the feature map obtained by that CNN to extract a fixed-length (512-dimensional in the experiments) feature vector at each convolutional map position. Then, the extracted feature vector is fed into two additional sibling fully connected layers. One outputs a set of rectangular positions relative to different scales and aspect ratios at that convolutional map position, and the other outputs the scores of these rectangular positions. ReLU is also used for the convolutional layer. Because the RPN shares the convolutional layers with the CNN for airport detection, it adds only a small computational load to the original CNN. More details of the RPN architecture can be found in [18].

Note that the region proposals produced by the RPN generally overlap each other. Therefore, non-maximum suppression (NMS) is performed on the produced proposals based on their scores. A region proposal is rejected if it has an intersection-over-union (IoU) overlap higher than a specified threshold with another region proposal that has a higher score. IoU was computed as the intersection divided by the union of two sets

IoU = area(A ∩ B) / area(A ∪ B)    (4)

where A ∪ B and A ∩ B denote the union and intersection of the two concerned sets, respectively, and area(·) denotes the area of a set. After NMS, the top-300 proposals are fed into the RoI pooling layer of the above CNN to produce the final airport detection.
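Equation (4) and the greedy NMS procedure described above can be sketched as follows. This is a simplified illustration: the `(x1, y1, x2, y2)` box format, the function names, and the threshold value are our own choices, not prescribed by the paper:

```python
import numpy as np

def iou(a, b):
    """Eq. (4): intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.7):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    remaining box and discard any box whose IoU with it exceeds `thresh`."""
    order = np.argsort(scores)[::-1]  # indices, best score first
    keep = []
    while len(order) > 0:
        i = order[0]
        keep.append(i)
        order = np.array([j for j in order[1:] if iou(boxes[i], boxes[j]) <= thresh])
    return keep
```

After running `nms` on the RPN outputs, the surviving proposals are ranked by score and the top 300 are passed on to the RoI pooling layer.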

2.3. Fine-Tuning and Augmentation

In our collected airport dataset, which is introduced in the following section, only a limited number of airports were available. As the CNN requires a large amount of data for training in order to avoid overfitting, we used the pre-trained network weights of the VGG model, which was successfully trained with the widely used ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset [18,21]. To adapt the pre-trained CNN to the airport detection task, we fine-tuned these network weights using the collected airport dataset. Due to limited memory, the shared convolutional layers were initialized by the pre-trained VGG model weights and only the conv3_1 (the first layer of the third stack of convolutional layers) and upper layers were fine-tuned. Because the RPN shares the convolutional layers with the CNN for airport detection, these two networks can be trained alternately, switching between fine-tuning for the region proposal task and fine-tuning for the airport detection task. In our experiments, we used augmentation techniques such as horizontal flip, vertical flip, image rotation, and image translation to enlarge the training data. It was observed that this domain-specific fine-tuning allows learning good network weights for a high-capacity CNN for airport detection.
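The augmentation operations listed above (flips, rotations, translation) can be sketched as follows. The exact rotation angles and translation offsets used for training are not specified in the text, so the values below are illustrative only; note also that in practice the ground-truth bounding boxes must be transformed consistently with each image:

```python
import numpy as np

def augment(image):
    """Produce augmented copies of an (H, W, C) training image:
    the original, horizontal flip, vertical flip, three 90-degree
    rotations, and one small translation with zero padding."""
    out = [image, np.fliplr(image), np.flipud(image)]
    for k in (1, 2, 3):          # 90, 180, 270 degree rotations
        out.append(np.rot90(image, k))
    dx, dy = 10, 10              # illustrative shift in pixels (assumed value)
    shifted = np.zeros_like(image)
    shifted[dy:, dx:] = image[:-dy, :-dx]
    out.append(shifted)
    return out
```

Applied to the 377 collected training images, transformations of this kind (combined and with varied parameters) are how the enlarged dataset of roughly 28,000 images reported in Section 3.1 can be produced.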

2.4. Improvement Techniques

In the original Faster R-CNN algorithm [18], which was proposed for detecting ordinary objects, nine region proposals were predicted at each feature map position, at three scales of 128², 256², and 512², and three aspect ratios of 1:1, 1:2, and 2:1. However, we found that there are many small airfields in remote sensing images of medium spatial resolution (e.g., 16 m), especially in America, Europe, and on certain islands. In such cases, the smallest scale of 128² would be too large to accurately locate these airfields. Therefore, in our implementation, another scale of 64² was added. For example, in Figure 2, there is a small airport (Matsu Peigan) located in the scene, and its bounding box is indicated by the blue rectangle. One of the positive samples obtained by the original Faster R-CNN algorithm with the smallest scale of 128² is indicated by the red rectangle in Figure 2a. Although this positive sample includes the whole area of the airfield, it is too large and most of the pixels in it belong to other ground objects. This redundant information lowers the quality of the training samples. By using a smaller scale of 64², the positive sample, which is shown in Figure 2b, is more accurate than the one shown in Figure 2a.

The runway, which always demonstrates an elongated linear geometric shape, is the most important feature for airport detection. We found that the aspect ratios of 1:1, 1:2, and 2:1, which are generally sufficient for detecting ordinary objects, cannot accurately locate vertical or horizontal runways that have elongated bounding boxes. Therefore, two aspect ratios of 1:3 and 3:1 were added in our implementation. For example, in Figure 3, there is a nearly horizontal airport (Matsu Nangan airport) and its bounding box, indicated by the blue rectangle, demonstrates an elongated geometric shape.
One of the positive samples obtained by the original Faster R-CNN algorithm with the aspect ratio of 1:2 is shown as the red rectangle in Figure 3a. One can see that this positive sample only covers a part of the runway area. By using a bigger aspect ratio of 1:3, one can obtain a better positive sample (shown as the red rectangle in Figure 3b). In our experiments, we observed that the added scale and aspect ratios helped to increase both the quality of positive samples and the final accuracy of airport detection.
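With the added 64² scale and the 1:3 and 3:1 aspect ratios, the set of anchor shapes enumerated at each feature map position can be sketched as follows. The function name is ours; the sketch follows the usual Faster R-CNN convention in which each anchor has area scale² and a width-to-height ratio equal to the aspect ratio:

```python
import numpy as np

def make_anchors(scales=(64, 128, 256, 512), ratios=(1/3, 1/2, 1, 2, 3)):
    """Enumerate (width, height) anchor shapes for one feature map position.
    For area scale**2 and ratio w:h = r, w = scale*sqrt(r) and h = scale/sqrt(r).
    The 64 scale and the 1:3 / 3:1 ratios are the additions proposed here;
    the rest are the original Faster R-CNN defaults."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(r)
            h = s / np.sqrt(r)
            anchors.append((w, h))
    return anchors
```

This yields 4 × 5 = 20 anchors per position instead of the original nine, so the RPN's sibling layers output 40 scores and 80 box offsets at each convolutional map position.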

(a) (b)

Figure 2. Positive samples at different scales: (a) scale of 128²; (b) scale of 64². The bounding box of the airport is shown by the blue rectangle, and the positive sample is shown by the red rectangle. [Image courtesy of Google Earth].

(a) (b)

Figure 3. Positive samples at different aspect ratios: (a) aspect ratio of 1:2; (b) aspect ratio of 1:3. The bounding box of the airport is shown by the blue rectangle, and the positive sample is shown by the red rectangle. [Image courtesy of Google Earth].

In CNN training by back-propagation and stochastic gradient descent [25], the quality of the positive samples is important to obtain good network weights, which will affect the final detection accuracy. In the original Faster R-CNN algorithm [18], which was proposed for ordinary object detection, positive samples were labeled to the region that has the highest IoU overlap with the bounding box, or to those regions that have an IoU ratio overlap higher than 0.7 with the bounding box. This works for ordinary objects that have a relatively small aspect ratio. However, in this paper we want to detect airports, which have elongated linear shapes. Simply using the IoU ratio overlap with the bounding box for that purpose could generate positive samples that cover only a small fraction of the runway area in the ground-truth. As the runway is often the only obvious feature that can be used for airport detection, this could lead to low-quality positive samples.

Nominal positive samples that are generated from airfield images in which the runway has a diagonal or anti-diagonal direction, for example, may cover only a small part of that runway. We observed that such samples contain so little valuable information that they cannot have a positive effect on network training. In Figure 4a, the Zhangjiakou Ningyuan airport (blue rectangle indicates bounding box) has a runway with an elongated linear shape in the diagonal direction. The ground-truth of the runway area occupies only a small fraction of pixels in its bounding box. One of the positive samples, obtained using the rules of the original Faster R-CNN algorithm (red rectangle in Figure 4a), only covers a very small fraction of the runway located in the right top corner. Most of the information in this nominal positive sample is therefore redundant. In Figure 4b, there is a smaller airport (Yangzhou Taizhou airport). One of the positive samples that were obtained using the original Faster R-CNN algorithm (red rectangle) only covers the terminal building and a very small fraction of the runway. As the runway could be the only detectable feature in this airport, we believe that this positive sample is not a typical airport image. In fact, it is even very hard for a human to determine whether it is an airport by looking at only this sample. We therefore propose to add a constraint on the positive samples to sieve the low quality ones, i.e., those that only cover a few airport pixels such as the examples shown in Figure 4. In our implementation, if a positive sample obtained by the original Faster R-CNN algorithm covers less than 70% of ground-truth runway pixels, it will be discarded in the training process. Although this will decrease the number of positive samples for CNN training, in our experiments, we observed that the final accuracy of airport detection did not decrease but increased instead.
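The 70% runway-coverage constraint can be sketched as follows, assuming a binary ground-truth runway mask is available for each training image. The function names and the mask-based formulation are our own; the text states only the 70% criterion itself:

```python
import numpy as np

def runway_coverage(runway_mask, box):
    """Fraction of ground-truth runway pixels (binary H x W mask) that fall
    inside a candidate positive-sample box (x1, y1, x2, y2)."""
    total = runway_mask.sum()
    if total == 0:
        return 0.0
    x1, y1, x2, y2 = box
    covered = runway_mask[y1:y2, x1:x2].sum()
    return covered / total

def sieve_positives(positives, runway_mask, min_coverage=0.7):
    """Discard nominal positive samples that cover less than 70% of the
    ground-truth runway pixels, as proposed in Section 2.4."""
    return [b for b in positives
            if runway_coverage(runway_mask, b) >= min_coverage]
```

A diagonal-runway sample like the one in Figure 4a, which encloses only a corner of the runway, would score well below 0.7 under this measure and be excluded from training.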

(a) (b)

Figure 4. Low quality positive samples: (a) Zhangjiakou Ningyuan airport; (b) Yangzhou Taizhou airport. The bounding box of the airport is shown by the blue rectangle, and the low quality positive sample is shown by the red rectangle. [Image courtesy of Google Earth].

3. Results

3.1. Dataset and Computational Platform

We collected images from 343 airports around the world from Google Earth. These airports have various sizes, from very large ones such as Chicago O'Hare to very small airfields such as Matsu Peigan. For each airport, images were collected at a spatial resolution of 8 m or 16 m and at different angles of view. The average size of these images is about 885 × 1613 pixels. 194 airports were randomly chosen as training data, which include 377 images collected at these airports. Some of the training airport images are shown in Figure 5. By using the augmentation methods mentioned in Section 2.3, an enlarged training dataset of about 28,000 images was obtained. The remaining 149 airports were used for detection performance testing, which include 281 images collected at these airports. Another 50 images, which have no airports but some similar ground objects, were also used for testing. Landsat 8 images acquired in four seasons, and a Gaofen-1 satellite scene that covers the Beijing-Tianjin-Hebei area, China, were also taken to test the practical performance of the proposed method.

All the experiments were implemented on a personal computer with double 3.6-GHz Intel i7 cores, 8 GB DDR4 memory, and an NVIDIA GeForce GTX1060 graphics card with 6 GB memory. The experiments were implemented using Caffe [26].

Remote Sens. 2018, 10, 443 8 of 20

Figure 5. Some of the training airport images. [Image courtesy of Google Earth].

3.2. Results on Test Images

We qualitatively and quantitatively compared our proposed method with the original Faster R-CNN algorithm and the airport detection method proposed in [13] (this method will be denoted by LSD + AlexNet further in the text). We used the default parameter setting in the code (Available online: https://github.com/rbgirshick/py-faster-rcnn) for both the original Faster R-CNN algorithm and our proposed method. The average IoU overlap ratio was used to assess location accuracy and evaluate the performance. In addition, the commonly used average precision (AP) [27] was adopted as a metric for evaluating detection performance. An obtained bounding box was considered correct if it had an IoU overlap with the ground truth higher than 50%; otherwise it was considered incorrect. Precision, recall, and accuracy were also used to evaluate the results [28,29]. The false alarm rate (FAR), which is defined as the ratio of the number of falsely detected airports to the number of testing images, was used as an additional metric for performance evaluation.

Figure 6 shows the detection results of the Tianshui Maijishan airport by the Faster R-CNN algorithm (Figure 6a) and by the proposed method (Figure 6b). The Faster R-CNN algorithm wrongly identified an elongated water body as an airport, while our proposed method successfully detected the airport in the scene. Although the LSD + AlexNet method also detected this airport in the scene (Figure 6c), its location accuracy was clearly somewhat lower than the result shown in Figure 6b.
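The IoU-based correctness criterion described above can be written down directly. A minimal sketch, with boxes given as (x1, y1, x2, y2) tuples and the 50% threshold following the evaluation protocol above:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def is_correct(detected, ground_truth, threshold=0.5):
    """A detection counts as correct if its IoU with the ground truth exceeds 50%."""
    return iou(detected, ground_truth) > threshold
```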




Figure 6. True detection results of different methods. (a) Faster R-CNN; (b) proposed method; (c) LSD + AlexNet. [Image courtesy of Google Earth].

The Frankfurt-Hahn airport (Figure 7) was detected by all three algorithms, but the proposed method (Figure 7b) is more accurate, as part of the airport area was not detected by the Faster R-CNN algorithm (Figure 7a). The LSD + AlexNet method detected the three runways separately in this scene. The Meixian Changgangji airport (Figure 8) was also detected more accurately by the proposed method than by the Faster R-CNN algorithm. The LSD + AlexNet method wrongly identified an elongated water body as an airport, although part of the airport area was included in the bounding box. All three approaches successfully detected the smaller Fuyang Xiguan airport (Figure 9). The LSD + AlexNet method only detected the runway, while both the proposed method and the Faster R-CNN algorithm successfully detected the whole airport area (including the terminal building).


Figure 7. True detection results of different methods. (a) Faster R-CNN; (b) proposed method; (c) LSD + AlexNet. [Image courtesy of Google Earth].



Figure 8. True detection results of different methods. (a) Faster R-CNN; (b) proposed method; (c) LSD + AlexNet. [Image courtesy of Google Earth].





Figure 9. True detection results of different methods. (a) Faster R-CNN; (b) proposed method; (c) LSD + AlexNet. [Image courtesy of Google Earth].

Provided with a scene without an airport in it (Figure 10), both the Faster R-CNN algorithm and the LSD + AlexNet method wrongly detected an area with an elongated linear feature similar to a runway as an airport (shown in red rectangles). Our proposed method, by contrast, successfully detected that there was no airport present in this image.


Figure 10. False detection result in an image with no airport in it. (a) Faster R-CNN; (b) LSD + AlexNet. [Image courtesy of Google Earth].



In our experiments we found that most of the relatively large airports were successfully detected by the proposed method. Most of the errors occurred on small airfields that are difficult to distinguish from similar ground objects. This is because the runway is the only useful geometric feature for detecting small airfields, and in medium-resolution satellite images its elongated linear geometric shape is very similar to that of other ground objects such as highways and long ditches. There is no airport, for example, in the scene shown in Figure 11, but all three methods nevertheless falsely detected one. The detected area contains an elongated ground object that is visually similar to a runway.




Figure 11. False detection result in an image with no airport in it. (a) Faster R-CNN; (b) proposed method; (c) LSD + AlexNet. [Image courtesy of Google Earth].

We carried out three experiments with different random training sets and reported the mean performance of the proposed method in Table 1. After adding the additional scale and aspect ratios (denoted as T1 in Table 1), the detection performance increased. In particular, the location accuracy and the precision of the proposed method are better than the results obtained by the original Faster R-CNN algorithm. After sieving low quality positive samples (denoted by T2 in the table), the detection performance increased further. This demonstrates that our specific improvement techniques, which were proposed to handle the elongated airport runway shape, effectively improved detection performance. The final recall value (0.837) of the proposed method is slightly less than the recall value (0.838) of the original Faster R-CNN algorithm. This means that the former is only slightly more conservative than the latter, and that the proposed method achieves better precision and accuracy with almost identical recall. The Receiver Operating Characteristic (ROC) graphs and the precision-recall curves of the results (Figures 12 and 13) also demonstrate that the proposed method outperforms the original Faster R-CNN algorithm: it achieves a lower false positive rate at the same true positive rate and a higher precision at the same recall value. The proposed method and the original Faster R-CNN algorithm, which both generate proposals based on the RPN, outperformed the LSD + AlexNet method. On our computational platform, it took about 0.16 s to process an image for both our proposed method and the Faster R-CNN algorithm, and about 3.23 s for the LSD + AlexNet method.

Table 1. Numerical performance. T1 denotes adding additional scale and aspect ratios, and T2 denotes sieving low quality positive samples. "Faster R-CNN+T1+T2" is the proposed method.

Algorithm                 Average IoU    AP       Recall   Precision   Accuracy   FAR
Faster R-CNN              0.664          0.794    0.838    0.778       0.835      0.090
Faster R-CNN + T1         0.708          0.795    0.833    0.826       0.854      0.069
Faster R-CNN + T1 + T2    0.715          0.800    0.837    0.841       0.859      0.066
LSD + AlexNet             0.366          0.557    0.655    0.597       0.637      0.203

Figure 12. ROC graphs.




Figure 13. Precision-recall curves.

We also tested the performance of the proposed method with a different number of training airports. Figure 14 indicates that the performance increased when more airport images were used. Using more airport images for training will therefore be beneficial for further increasing the detection performance. However, due to the difficulty of collecting many airport images from remote sensing scenes, we consider this to be part of our future work.

Figure 14. Performance of the proposed method with different numbers of training images.

3.3. Results on Landsat 8 Images

We also evaluated how the proposed method performed on airport appearances in different seasons. The CNN model weights, which were obtained using the airport dataset described previously, were also used for this experiment. Figure 15 shows the detection results of the proposed method on Landsat 8 images (in true-color composite of 8-bit quantization) of the Jiagedaqi airport in four seasons. These images, collected in different seasons, were pansharpened to a spatial resolution of 15 m and show differing background colors. In winter, most of the background objects are covered with snow, and the cleaned runway still demonstrates the typical elongated linear geometric shape. Our proposed method successfully detected this airport in all seasons, although visually the location accuracy in winter is not as high as in the other three seasons. The results of the original Faster R-CNN algorithm are shown in Figure 16. Although the detection results in spring (Figure 16a), summer (Figure 16b) and autumn (Figure 16c) are similar to the results shown in Figure 15, visually the original Faster R-CNN algorithm has a lower location accuracy on the winter image (Figure 16d) than our proposed method. The detection results on Landsat 8 images in different seasons demonstrate the generalization capability of CNN-based airport detection methods when applied to images that are recorded by other sensors and in other seasons.




Figure 15. Detection results of the proposed method in different seasons at Jiagedaqi airport. (a) 15 April 2014; (b) 14 June 2015; (c) 20 September 2016; (d) 16 December 2016.




Figure 16. Detection results of the original Faster R-CNN algorithm in different seasons at Jiagedaqi airport. (a) 15 April 2014; (b) 14 June 2015; (c) 20 September 2016; (d) 16 December 2016.


3.4. Results on a Gaofen-1 Scene

A Gaofen-1 satellite scene (Figure 17), which covers the Beijing-Tianjin-Hebei area in China, was also tested in our experiments. The size of the scene is 16,164 × 16,427 pixels at 16 m spatial resolution, and it was recorded on 2 November 2014. It is a winter scene with vegetation in the leaf-off state; the grass fields next to the airport runways are therefore less abundant. As the multispectral Gaofen-1 data has a radiometric resolution of 10 bits while the CNN used for airport detection was trained with 8-bit images, we pre-processed the scene by converting it to 8-bit quantization, and we used a true-color composite in our experiment. Of the 10 airports in the scene, the Beijing Capital airport (Figure 18a) and the Tianjin Binhai airport (Figure 18b) are two large international civilian airports, while the others are mid-sized or small civilian airports or military airports. Because the image is too large to be processed in its entirety due to memory limitations, we first divided it into small sub-images of size 600 × 1000 pixels. We maintained an overlap of 300 pixels between two neighboring sub-images, which is about the length of the runway of the Beijing Capital airport, to avoid splitting an airport into two parts over two sub-images. Airport detection was then performed on each of these sub-images. An NMS was performed on the merged result that was combined from the detection results of the sub-images. If the score of a proposal is larger than 0.9, an airport is claimed to be detected.

Figure 17. A scene of the Gaofen-1 satellite that covers the Beijing area (true-color composite).

Our proposed method successfully detected almost all airports in about 95 s by using the pre-trained CNN model, which was trained with the airport dataset introduced in Section 3.1. Nine airports were successfully detected in the scene by applying the proposed method, and no false detection was found (Figure 18). The Faster R-CNN algorithm, on the other hand, made two false positive detections (shown in Figure 19j,k). This demonstrates that, compared to the Faster R-CNN algorithm, our proposed method is better at learning feature representations from the same training dataset, which was obtained from Google Earth remote sensing images. It also has a better generalization capability when applied to other sensors.

There is an airport in the Gaofen-1 scene that was not successfully detected by either method (Figure 20). This airport has two runways that are far apart from each other, and there is no obvious terminal building. Because Gaofen-1 has different spectral characteristics than the remote sensing images from Google Earth, we expect that using Gaofen-1 images for CNN training would lead to a better detection performance.


Figure 18. Detection results in the Gaofen-1 satellite scene by the proposed method. (a–i) Nine true detection results. All images have been zoomed to the same image height for a clear presentation.



Figure 19. Detection results in the Gaofen-1 satellite scene by the Faster R-CNN algorithm. (a–i) Nine true detection results. (j,k) Two false detection results. All images have been zoomed to the same image height for a clear presentation.


Figure 20. The airport that was not detected by either method.

4. Conclusions

In this paper, we proposed a fast and automatic airport detection method using convolutional neural networks. Based on the Faster R-CNN algorithm, we put forward some specific improvement techniques to handle the typical elongated linear geometric shape of an airport. By using additional scales and aspect ratios to increase the quality of positive samples, and by using a constraint to sieve out low-quality positive samples, a better airport detection performance was obtained. Note that we used the same parameter settings for both our method and the original Faster R-CNN algorithm to demonstrate that the increase in performance really benefited from the proposed specific improvement techniques. Compared to the reported results on airport detection in the literature, this is, to the best of our knowledge, the fastest automatic airport detection method that can achieve state-of-the-art performance. Compared to other airport detection algorithms, the proposed method needs relatively little pre-processing, and its independence from the difficult effort of handcrafted feature design is a major advantage over many traditional methods.

Our experimental results demonstrated that although the Faster R-CNN algorithm can obtain good results for ordinary objects, a better detection accuracy can be achieved by using dedicated improvement techniques for domain-specific objects (like the elongated linear airport). The CNN model was trained on a limited number of images collected from Google Earth, and our results on several Landsat 8 images acquired in four seasons and a Gaofen-1 scene demonstrated its generalization capability, as it was successfully applied to images recorded by other sensors. We believe that by including more airport images, such as images from other sensors and images of high spatial resolution, further improvement of the proposed airport detection method can be expected. This is a topic of our future research work.

In our experiments, the Faster R-CNN algorithm was used due to its state-of-the-art detection accuracy and acceptable computational speed. Other CNN-based object detection algorithms, such as YOLO [19] and YOLO9000 [20], are also worth testing. We believe that the proposed specific improvement techniques for airport detection can also be applied to other approaches that were originally proposed for detecting ordinary objects in daily life scenes; this, too, is part of our future research work.

Acknowledgments: This work was supported in part by the National Science and Technology Major Project of China under Grants 30-Y20A07-9003-17/18 and 30-Y20A04-9001-17/18, in part by the National Natural Science Foundation of China under Grants 41671427 and 41371398, and in part by the National Key R&D Program of China under Grant 2016YFB0502300.


Author Contributions: Fen Chen, Ruilong Ren and Wenbo Xu conceived and designed the experiments; Fen Chen and Ruilong Ren performed the experiments; Guiyun Zhou and Yan Zhou contributed the materials and computing resources; Fen Chen, Ruilong Ren and Tim Van de Voorde analyzed the data and wrote the paper.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Tao, C.; Tan, Y.; Tian, J. Airport Detection from Large IKONOS Images Using Clustered SIFT Keypoints and Region Information. IEEE Geosci. Remote Sens. Lett. 2011, 8, 128–132. [CrossRef]
2. Tang, G.; Xiao, Z.; Liu, Q.; Liu, H. A Novel Airport Detection Method via Line Segment Classification and Texture Classification. IEEE Geosci. Remote Sens. Lett. 2015, 12, 2408–2412. [CrossRef]
3. Pi, Y.; Fan, L.; Yang, X. Airport Detection and Runway Recognition in SAR Images. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Toulouse, France, 21–25 July 2003; pp. 4007–4009.
4. Huertas, A.; Cole, W.; Nevatia, R. Detecting Runways in Complex Airport Scenes. Comput. Vis. Graph. Image Process. 1990, 51, 107–145. [CrossRef]
5. Sun, K.; Li, D.; Chen, Y.; Sui, H. Edge-preserve Image Smoothing Algorithm Based on Convexity Model and Its Application in The Airport Extraction. In Proceedings of the SPIE Remotely Sensed Data Information, Nanjing, China, 26 July 2007; pp. 67520M-1–67520M-11.
6. Chen, Y.; Sun, K.; Zhang, J. Automatic Recognition of Airport in Remote Sensing Images Based on Improved Methods. In Proceedings of the SPIE International Symposium on Multispectral Image Processing and Pattern Recognition: Automatic Target Recognition and Image Analysis; and Multispectral Image Acquisition, Wuhan, China, 15 November 2007; pp. 678645-1–678645-8.
7. Liu, D.; He, L.; Carin, L. Airport Detection in Large Aerial Optical Imagery. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, QC, Canada, 17–21 May 2004; pp. 761–764.
8. Qu, Y.; Li, C.; Zheng, N. Airport Detection Based on Support Vector Machine from a Single Image. In Proceedings of the IEEE International Conference on Information, Communications and Signal Processing, Bangkok, Thailand, 6–9 December 2005; pp. 546–549.
9. Bhagavathy, S.; Manjunath, B.S. Modeling and Detection of Geospatial Objects Using Texture Motifs. IEEE Trans. Geosci. Remote Sens. 2006, 44, 3706–3715. [CrossRef]
10. Aytekin, O.; Zongur, U.; Halici, U. Texture-based Airport Runway Detection. IEEE Geosci. Remote Sens. Lett. 2013, 10, 471–475. [CrossRef]
11. Zhu, D.; Wang, B.; Zhang, L. Airport Target Detection in Remote Sensing Images: A New Method Based on Two-Way Saliency. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1096–1100.
12. Budak, Ü.; Halici, U.; Şengür, A.; Karabatak, M.; Xiao, Y. Efficient Airport Detection Using Line Segment Detector and Fisher Vector Representation. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1079–1083. [CrossRef]
13. Zhang, P.; Niu, X.; Dou, Y.; Xia, F. Airport Detection on Optical Satellite Images Using Deep Convolutional Neural Networks. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1183–1187. [CrossRef]
14. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
15. Uijlings, J.; Van de Sande, K.; Gevers, T.; Smeulders, A. Selective Search for Object Recognition. Int. J. Comput. Vis. 2013, 104, 154–171. [CrossRef]
16. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 346–361.
17. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December 2015; pp. 1440–1448.
18. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; pp. 91–99.
19. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788.
20. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
21. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
22. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556v6.
23. Nair, V.; Hinton, G.E. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the International Conference on Machine Learning, Haifa, Israel, 21–24 June 2010; pp. 807–814.
24. Srivastava, N.; Hinton, G.; Krizhevsky, A. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
25. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Comput. 1989, 1, 541–551. [CrossRef]
26. Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe: Convolutional Architecture for Fast Feature Embedding. arXiv 2014, arXiv:1408.5093.
27. Everingham, M.; Gool, L.V.; Williams, C.K.I.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [CrossRef]
28. Fawcett, T. ROC Graphs: Notes and Practical Considerations for Researchers; HP Laboratories: Palo Alto, CA, USA, 2006.
29. Senthilnath, J.; Sindhu, S.; Omkar, S.N. GPU-based Normalized Cuts for Road Extraction Using Satellite Imagery. J. Earth Syst. Sci. 2014, 123, 1759–1769. [CrossRef]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).