remote sensing

Article Automatic Identification and Dynamic Monitoring of Open-Pit Mines Based on Improved Mask R-CNN and Transfer Learning

Chunsheng Wang, Lili Chang *, Lingran Zhao and Ruiqing Niu

Institute of Geophysics & Geomatics, University of Geosciences, 430074, China; [email protected] (C.W.); [email protected] (L.Z.); [email protected] (R.N.) * Correspondence: [email protected]; Tel.: +86-132-9795-5213

 Academic Editor: Jonathan C-W Chan  Received: 3 August 2020; Accepted: 14 October 2020; Published: 22 October 2020

Abstract: As the ecological problems caused by mine development become increasingly prominent, the conflict between mining activity and environmental protection is gradually intensifying. There is an urgent problem regarding how to effectively monitor mineral exploitation activities. In order to automatic identify and dynamically monitor open-pit mines of Province, an open-pit mine extraction model based on Improved Mask R-CNN (Region Convolutional Neural Network) and Transfer learning (IMRT) is proposed, a set of multi-source open-pit mine sample databases consisting of Gaofen-1, Gaofen-2 and Google Earth satellite images with a resolution of two meters is constructed, and an automatic batch production process of open-pit mine targets is designed. In this paper, pixel-based evaluation indexes and object-based evaluation indexes are used to compare the recognition effect of IMRT, faster R-CNN, Maximum Likelihood (MLE) and Support Vector Machine (SVM). The IMRT model has the best performance in Pixel Accuracy (PA), Kappa and MissingAlarm, with values of 0.9718, 0.8251 and 0.0862, respectively, which shows that the IMRT model has a better effect on open-pit mine automatic identification, and the results are also used as evaluation units of the environmental damages of the mines. The evaluation results show that level I (serious) land occupation and destruction of key mining areas account for 34.62%, and 36.2% of topographical landscape damage approached level I. This study has great practical significance in terms of realizing the coordinated development of mines and ecological environments.

Keywords: ecological problems; Improved Mask R-CNN; transfer learning; multi-source; open-pit mines

1. Introduction With the development of remote sensing technology, target recognition has been widely used. It has important application significance and research value in guiding social economic construction and mineral resource development [1,2]. With the land encroachment and vegetation destruction caused by mining development becoming more prominent, the contradiction between mining activity and environmental protection is gradually intensifying [3,4]. As a research hotspot in the field of remote sensing image processing, target identification can effectively identify and dynamically monitor through the mine’s roughness texture and radiation intensity so as to guide environmental protection of the mine, as well as management and ecological restoration [5,6]. Throughout the history of rectification and standardization of mining operations in China, the mining management departments attached great importance to illegal and destructive mining activity. However, the identification and monitoring of open-pit mines mainly adopt traditional extraction manual methods [7,8]. The visual interpretation method needs abundant expert interpretation

Remote Sens. 2020, 12, 3474; doi:10.3390/rs12213474 www.mdpi.com/journal/remotesensing Remote Sens. 2020, 12, 3474 2 of 20 experience and detailed field survey data. Due to the low degree of automation, it is unsuitable for automatic identification of open-pit mines elements [9,10]. The traditional recognition of remote sensing information in open-pit mines can be divided into pixel-based and object-oriented methods. The extraction methods based on pixels mainly include Maximum Likelihood (MLE) classification, Decision Tree (DT) classification and Support Vector Machine (SVM) classification [11,12], which are methods that produce a lot of information redundancy in the extraction process. Object-oriented classification technology mainly determines classification rules artificially. For example, Hou et al. [13] established a multi-scale image space through object-oriented technology to classify the land cover of aerial images after an earthquake. Kang et al. [14] proposed a ground-object recognition method based on histogram feature knowledge for typical mine objects. However, the traditional extraction manual methods are greatly affected by subjective factors, and they are not universal enough to satisfy the demand of multi-source remote sensing image extraction in open-pit mines. With the development of remote sensing technology, the application of deep learning technology to remote sensing image information extraction has become a new technical trend [15]. Deep learning provides an approach to learn effective features automatically from a training set [16]. It can perform unsupervised characteristic learning from an enormous original image dataset such as hyperspectral and radar images. For example, Dang et al. [17] introduced a deep convolutional neural network to conduct classification of land cover based on an object-oriented classification system through the AlexNet model. In view of the overfitting phenomenon caused by insufficient label samples in the supervised training, Wang et al. [18] proposed a deep transit-based learning method and applied a deep residual network to hyperspectral image classification. Clabaut et al. [19] proposed a deep learning method based on convolutional neural networks, and relying on geo big data that can be used for the detection of gossans, this approach could provide a useful precursor tool to identify gossans prior to more detailed surveys using hyperspectral imaging. Cai et al. [20] applied the deep learning network model on a large scale to objectively divide the metallogenic area into a nonlinear spatial area, whose features can reflect the diversified geological data. According to the texture characteristics of each scene in open-pit mines, Yang et al. [21] adopted an end-to-end deep learning model to carry out super-resolution reconstruction with deep texture transfer and improved the spatial resolution of open-pit mines. Xiang et al. [22] proposed an improved UNet twin network structure to skip the corresponding layer of the encoding end, and they ultimately realized end-to-end mining change detection of remote sensing images. Comprehensive research results show that deep learning extraction models have better robustness, generalization, applicability and accuracy for remote sensing target recognition, which is an effective way to promote the application of automatic remote sensing extraction methods. Although the rapid development of image processing, pattern recognition and computer vision technology has created conditions for the improvement of geoscience target recognition technology, the application of target recognition in open-pit mines is not mature enough. Due to the fact that the content of remote sensing images is complex, and because the target source is diverse, the aspects that make the common feature extraction method cannot represent the target of remote sensing image fully and accurately. There are still many problems to be solved. As the contradiction between mining activity and environmental protection gradually intensifies, many scholars have conducted research on mines environmental damages. Wu et al. [23] discussed the semi-arid grassland landscape’s spatial distribution and the extent of the surface mining impact on the ecological health by means of landscape ecological health evaluation, buffer analysis, landscape ecological function contribution rate and modified landscape disturbance index measurement. Slavomir et al. [24] used the measurement results to visualize the spatial changes in the open-pit mines, which provided a necessary basis for taking relevant measures to recover the landscape affected by mining. Gong et al. [25] took full advantage of high-resolution topographic survey technology and algorithms to carry out accurate spatial analysis of serious water and soil erosion in open-pit coal mines in semi-arid areas of northern China. By studying the spatio-temporal dynamics of forest cover in the Mufu Mountain mining area, Zhang et al. [26] discussed the land cover changes and the landscape Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 19 Remote Sens. 2020, 12, 3474 3 of 20 dynamics of forest cover in the Mufu Mountain mining area, Zhang et al. [26] discussed the land coverremodeling changes caused and by the long-term landscape surface remodeling mining. caused It has great by long significance-term surface in realizing mining. the It coordinated has great significancedevelopment in of realizing mines and thethe coordinated ecological development environment. of mines and the ecological environment. With the aim of the aboveabove research, this paper proposed an Improved Mask R-CNNR-CNN (Region Convolutional NeuralNeural Network) Network) and and Transfer Transfer learning learning (IMRT) (IMRT) model model and constructedand constructed a set ofa multi-sourceset of multi- sourcemine sample mine sample database database consisting consist of Gaofen-1,ing of Gaofen Gaofen-2-1, Gaofen and Google-2 and Earth Google satellite Earth images satellite to automaticimages to automaticidentify and identify dynamically and dynamic monitorally open-pit monitor mines open [27-,28pit]. mines Meanwhile, [27,28]. a variety Meanwhile, of indicators a variety based of indicatorson pixels and based objects on pixel weres usedand object to evaluates were the used open-pit to evaluate mine the identification open-pit mine results identification quantitatively results and verifyquantitatively the feasibility and verify of the the proposed feasibility method. of the proposed The IMRT me modelthod. can The give IMRT full model playto can the give small full target play torepresentation the small target ability representation of low-level features, abilitywhich of low can-level improve features, the detection which can accuracy improve of open-pit the detection mines accuracyon the premise of open that-pit the mines detection on the accuracy premise of that conventional the detection targets accuracy is not affected. of conventional Based on targets these results, is not affected.multi-time Based series on monitoring these results, research multi on-time key series mining monitoring areas in Hubei research Province on key was mining finally areas studied. in Hubei Province was finally studied. 2. Study Area 2. Study Area The study area is situated in Hubei Province. Hubei Province is located in central China, and itsThe geographical study area is coordinatessituated in Hubei are Longitude Province. Hubei 108◦21 Province0~116◦08 0isE, located Latitude in central 29◦020~33 China,◦170 N[and29 its]. Thegeographical topography coordinates of the province are fluctuatesLongitude greatly, 108°21 and′~116°08 the landform′ E, Latitude is diversified. 29°02′~33°17 The whole′ N [29]. province The topographyspans two first-order of the province tectonic fluctuates units, the greatly Qinling, and fold the system landform and theis diversified. paraplatform. The whole province Various spanstypes oftwo magmatic first-order are tectonic widespread, units, the and Qinling metamorphic fold system rocks and are the rich Yangtze in mineral paraplatform. resources [Various30]. As typesshown of in magmatic Figure1, the are , widespread, copper, gold and and metamorphic silver deposits rocks in Hubeiare rich province in mineral are mainlyresources distributed [30]. As shownin Proterozoic in Figure metamorphic 1, the iron, copper, rocks, Mesozoicgold and silver magmatic deposits rocks in andHubei their province contact are metamorphic mainly distributed zone in inDaye, Proterozoic , ,metamorphic Yangxin, rocks, ZhushanMesozoic and magmatic Yunxi. rocks The phosphate and their contact mines are metamorphic located in the zone Han in ,River alluvialEzhou, plainHuangshi, and hilly Yangxin, region Zhushan such as Yichengand Yunxi. and The . phosphate Limestone mines are and located dolomite in the mines Han Riverare distributed alluvial plain in Paleozoic and hilly sedimentary region such strata as Yicheng in , and Zhongxiang. Wuchang, , Limestone Xiangfan and dolomite and Tongshan. mines areCoal distributed and pyrite mines in Paleozoic mainly exist sedim inentary the Paleozoic strata in sedimentary Yichang, Wuchang, rocks in Yichang, Jingmen, Enshi Xiangfan and Jianshi. and TheTongshan. quarries Coal are mainlyand pyrite distributed mines mainly in , exist , in the Paleozoic sedimentary and other places.rocks in The Yichang, experimental Enshi anddata Jianshi. are all from The thequarries key mining are mainly areas distributed of remote sensing in Shiyan, interpretation Suizhou, Huanggang in Hubei Province and other with places. about The5500 experimental km2, except for data , are all Tianmenfrom the andkey Qianjiang.mining areas of remote sensing interpretation in Hubei Province with about 5500 km2, except for Xiantao, and Qianjiang.

Figure 1. Landform and mine distribution map of Hubei Province Province..

Remote Sens. 2020, 12, 3474 4 of 20

Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 19 3. Methods 3. Methods 3.1. Improved Mask R-CNN 3.1. Improved Mask R-CNN As the Figure2 shows, based on the faster R-CNN framework, Mask R-CNN adds a parallel As the Figure 2 shows, based on the faster R-CNN framework, Mask R-CNN adds a parallel semantic segmentation branch to conduct the target detection and regression [31,32]. In this model, semantic segmentation branch to conduct the target detection and regression [31,32]. In this model, featurefeature pyramid pyramid networks networks (FPN) (FPN)+ + ResNet101ResNet101 is used to to ex extracttract the the features features of of the the backbone backbone network. network. AfterAfter feature feature extraction, extraction, this this model model used used the the Region Region Proposal Proposal Network Network (RPN) (RPN) to to carry carry out out end-to-end end-to- trainingend training of the target of the detection target detection frame in frame open-pit in open mines.-pit The mines. dilated The convolution dilated convolution was also involved was also in theinvolved feature calculationin the featu andre calculation finally used and the finally RoI (Region used the of RoI Interest) (Region Align of Interest) to solve theAlign problem to solve of the large actualproblem deviation of large in actual the original deviation image in the [33 original]. image [33].

FigureFigure 2. 2.Mask Mask R-CNNR-CNN network generalization diagram diagram..

3.1.1.3.1.1. Convolutional Convolutional Backbone Backbone andand DilatedDilated ConvolutionConvolution ConvolutionalConvolutional backbone backbone is is a aseries series featurefeature maps used by by convolutional convolutional layer layer to to extract extract open open-pit-pit mines,mines, and and it it mainly mainly applies applies the the structure structure ofof ResNet101.ResNet101. The The ResNet101 ResNet101 network network is is a aresidual residual network network proposedproposed by by four four scholars scholars from from Microsoft Microsoft Research, Research, and and its its internal internal residual residual block block uses uses skip skip connectsconnects to alleviateto alleviate theproblem the problem of gradient of gradient disappearance disappearance caused caused by increasing by increasing depth depth in the in network.the network. The T deephe networkdeep network is designed is designed as H(x )as= HF(x) =+ Fx(,x which) + x, which can also can bealso converted be converted to a to residual a residual function functionF(x )F=(x)H =( x) xH.(x As) − longx. As as longF(x as) = F0,(x) this = 0, formulathis formula constitutes constitutes an identityan identity map mapH( xH)(=x)x = sox so that that the the residuals residuals can can be − fittedbe fitted more more easily easily [34]. As[34]. Figure As Figure3 shows, 3 shows, Mask Mask R-CNN R-CNN divides divides the Resnet101 the Resnet101 network network into five into stages, five whichstages, are which denoted are as denoted C1, C2, as C3, C1, C4, C2, and C3, C5. C4, The and five C5. stages The five correspond stages correspond to the output to ofthe feature output maps of offeature five scales, maps which of five are scales, used which to build are theused feature to build pyramid the feature of the pyramid FPN network. of the FPN network. LocalLocal receptive receptive fields fields are are a a very very importantimportant concept in in the the Convolutional Convolutional Neural Neural Network Network (CNN). (CNN). WhenWhen CNN CNN performs performs instance instance segmentation,segmentation, on account of of the the final final feature feature map map size size being being much much smallersmaller than than the thesize size ofof the input input images, images, the the final final predicted predicted split split mask mask will willbe rough. be rough. However, However, the thedilated dilated convolution convolution can can control control the the rate rate of of the the convolution convolution kernel kernel and and obtain obtain different different convolution convolution receptive fields [35]. Therefore, dilated convolution solves the contradiction between improving the receptive fields [35]. Therefore, dilated convolution solves the contradiction between improving receptive fields and maintaining the size of the feature map in CNN. Figure 4a shows the local the receptive fields and maintaining the size of the feature map in CNN. Figure4a shows the local receptive fields with a traditional 3 × 3 convolution kernel, which is the same as the 3 × 3 dilated receptive fields with a traditional 3 3 convolution kernel, which is the same as the 3 3 dilated convolution kernel with a rate of 1. In× Figure 4b, when the dilated convolution kernel has a× rate of 2, convolution kernel with a rate of 1. In Figure4b, when the dilated convolution kernel has a rate of 2, the local receptive field of the convolution kernel increases to 7 × 7. In this paper, a dilated convolution the local receptive field of the convolution kernel increases to 7 7. In this paper, a dilated convolution kernel with a rate of 2 is added to the structure of the feature ×pyramid, and the dilated convolution kernel with a rate of 2 is added to the structure of the feature pyramid, and the dilated convolution

Remote Sens. 2020, 12, 3474 5 of 20

Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 19 Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 19 operation is carried out on the output features in each pyramid stage. Finally, the accuracy of mask operation is carried out on the output features in each pyramid stage. Finally, the accuracy of mask predictionoperation can is carried be effectively out on the improved output features in the category in each predictionpyramid stage. stage Finally, at the pixelthe accuracy level. of mask predictionprediction can can be be effectively effectively improved improved inin thethe categorycategory prediction stagestage at at the the pixel pixel level. level.

Figure 3. Feature pyramid networks (FPN) structure diagram. FigureFigure 3. 3. Feature Feature pyramid pyramid networks (FPN) structurestructure diagram. diagram.

(a) (b) (a) (b) Figure 4. Local receptive fields: (a) dilated convolution kernel with rate = 1; (b) dilated convolution FigureFigurekernel 4. 4. with LocalLocal rate receptive receptive = 2. fields: fields: ( (aa)) dilated dilated convolution convolution kernel kernel with with rate rate = 1;1; ( (bb)) dilated dilated convolution convolution kernelkernel with with rate rate == 2.2. 3.1.2. RPN Framework and RoI Align 3.1.2.3.1.2. RPN RPN Framew Frameworkork and and RoI RoI Align Align RPN is a Full Convolutional Neural (FCN) network. RPN carries out an end-to-end training of theRPN RPN open is is- pita a Full mines Convolutional target detection Neural Neural (FCN) frame(FCN) network. network. by adding RPN RPN additional carries carries out out category an an end end-to-end -to and-end regression training training of of thetheconvolutional open-pit open-pit mines mines layers target targeton detection the CNN. detection frame This frameframework by adding by additional laysadding anchor additional category with different and category regression proportions and convolutional regression on the convolutionallayersoriginal on image, the CNN. layers and, Thison at the framework same CNN. time This lays, it framework generate anchors with candidate lays di anchorfferent boxes proportionswith which different can on match proportions the originalthe open on image,-pit the originaland,mines at thetargetsimage, same ofand, time,various at itthe generatesscales same and time candidateextract, it generate the boxesboundary.s candidate which RPN can boxesfirst match traverses which the open-pit canthe CNNmatch mines feature the targetsopen map-pit of minesvariousoutput targets scaleswith ofa and 3 various × 3 extract slidin scalesg the window. boundary.and extract Mask RPN theR-CNN boundary. first establishes traverses RPN thean first m CNN ×traverses m featurebinary the mask map CNN outputto distinguishfeature with map a 3 the3 sliding front and window. rear scenes Mask of each R-CNN target establishes object with an them branchm binary of instance mask segmentation. to distinguish At the the frontcurrent and output× with a 3 × 3 sliding window. Mask R-CNN establishes× an m × m binary mask to distinguish therearposition front scenes and in of rear the each pixelscenes target space of objecteach of thetarget with open object the-pit branch with mines the of image, instancebranch the of segmentation. mappinginstance segmentation. point At of the the currentcenter At the slidingposition current window is the anchor [36,37]. Five feature maps, namely, P2, P3, P4, P5, and P6, with different scales positionin the pixel in the space pixel of the space open-pit of the mines open-pit image, mines the image, mapping the point mapping of the point center of slidingthe center window sliding is are several anchor boxes generated by the RPN, and the preset proposal can obtain the size and windowthe anchor is the [36 ,anchor37]. Five [36,37]. feature Five maps, feature namely, maps P2,, namely, P3, P4, P2, P5, P3, and P4, P6, P5, with and di P6fferent, with scales different are several scales coordinates of corresponding areas in the open-pit mines’ target image, where P1, P2, P3, P4, and P5 areanchor several boxes anchor generated boxes by generated the RPN, by and the the RPN, preset and proposal the preset can proposal obtain the can size obtain and coordinates the size and of are the feature pyramid. Mask R-CNN abandons the P1 characteristics of stage1 and takes down coordinatescorresponding of corresponding areas in the open-pit areas in mines’ the open target-pit image, mines’ where target P1,image P2,, P3,where P4, andP1, P2, P5 P3, are P4, the and feature P5 sampling based on stage5 (P5) to obtain P6 characteristics. Finally, the five feature maps of different arepyramid. the feature Mask pyramid. R-CNN abandonsMask R-CNN the P1 abandons characteristics the P1 ofcharacteristics stage1 and takes of stage1 down and sampling takes down based scales (P2, P3, P4, P5, and P6) are input into the RPN to generate RoI, respectively. Due to different samplingon stage5 based (P5) to on obtain stage5 P6 (P5) characteristics. to obtain P6 characteristics. Finally, the five Finally, feature the maps five offeature different maps scales of different (P2, P3, P4,stride P5, andlengths, P6) RoI are inputAlign was into performed the RPN to on generate the corresponding RoI, respectively. stride of Duefour tofeature different maps stride (P2, P3, lengths, P4, scalesand (P5P2,) withP3, P4, different P5, and scales. P6) are RoI input Align into aims the to RPN solve to the generate problem RoI of, largerespectively. actual deviation Due to different in the strideRoI Align lengths, was RoI performed Align was on performed the corresponding on the corresponding stride of four stride feature of maps four feature (P2, P3, maps P4, and (P2, P5) P3, with P4, dioriginalfferent scales.image caused RoI Align by the aims formation to solve of the candidate problem regions of large in actualthe quantization deviation process in the original [38]. Concat image andconnection P5) with is different conducted scales. based RoI on Alignthe RoI aims Align to generated solve the andproblem divide ofs thelarge network actual into deviation three parts: in the caused by the formation of candidate regions in the quantization process [38]. Concat connection originalthe full imagey connected caused by prediction the formation class, theof candidate fully connected regions predictivein the quantization rectangle process box, and [38]. the Concat full connection is conducted based on the RoI Align generated and divides the network into three parts: the fully connected prediction class, the fully connected predictive rectangle box, and the full

Remote Sens. 2020, 12, 3474 6 of 20 is conducted based on the RoI Align generated and divides the network into three parts: the fully connected prediction class, the fully connected predictive rectangle box, and the full convolution predicts pixel segmentation. These three parts are represented by the mask, the target border and the border’s class of open-pit mines, respectively.

3.1.3. Border Regression and Loss Function

For a given border (Px, Py, Pw, Ph), target border regression is utilized to obtain the final regression border (Fx, Fy, Fw, Fh) and make it closer to the real border (Gx, Gy, Gw, Gh). That is, we need to find a mapping f, such that f (Px, Py, Pw, P ) = (Fx, Fy, Fw, F ), and (Fx, Fy, Fw, F ) (Gx, Gy, Gw, G ), h h h ≈ h where the subscripts x, y, w, and h represent the horizontal distance, the vertical distance, the width and the height of the three types border of center points, respectively. The border regression learns about these transformations, and translation (tx, ty) and scale zooming (tw, th) are calculated based on the parameters of the real border and the predicted border. The calculations are as follows:

tx = (Gx Px)/Pw (1) −  ty = Gy Py)/P (2) − h = Gw/Pw tw log2 (3) = Gh/Ph th log2 (4) T T The objective function is expressed as F (P) = d ϕ5(P), d is the parameter to learn (* represents x, ∗ ∗ ∗ y, w, h), ϕ5(P) is the eigenvector that predicts the border, and F (P) is the regression value obtained. ∗ The goal is to minimize the difference between the regression value and the true value t = (tx, ty, tw, ∗ th). The loss function obtained is as follows:

Xi i T i 2 Lossreg = (t d ϕ5(P )) (5) N ∗− ∗ Discrete probability distribution p is used to represent the probability of the target and background ( 0 negative label of the open-pit mines, in which the real border is labeled as p = . The loss function ∗ 1 positive label is obtained as follows: Loss = log[p∗p+(1 p ∗)(1 p)] (6) cls − − − The mean value of the sigmoid function was calculated for each pixel of the mask, which was defined as the average binary cross entropy loss function LOSSmask. This method can effectively improve the effect of instance segmentation. Therefore, the loss function of Mask-RCNN consists of the three loss functions, including classification loss, regression loss and segmentation loss [39]. The total loss function is as follows:

LOSS = LOSSreg + LOSScls + LOSSmask (7)

3.2. Transfer Learning The essence of transfer learning is the transfer and reuse of knowledge. The existing knowledge is called the source domain, while the new knowledge which needs to be learned is called the target domain [40]. According to the definition of transfer learning, it can be divided into three types: distributed differential transfer learning, characteristic differential transfer learning and tag differential transfer learning. Distributed differential transfer learning refers to the difference in the marginal distribution or conditional probability distribution between the source domain and the target domain. Characteristic differential transfer learning refers to the difference in the feature space between the source domain and the target domain. Tag differential transfer learning refers to the difference Remote Sens. 2020, 12, 3474 7 of 20

Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 19 in the tag space between the source domain and the target domain. Generally, the target domain is diRemotefferent Sens. fromMathematically, 2020 the, 12, sourcex FOR PEER domain transfer REVIEW in learning terms of contains data distribution, two elements: characteristic domains and dimensions tasks [42]. and The model7 domainsof 19 Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 19 outputU contains change conditions two elements: [41]. sample feature space Ӽ and the probability distribution p(x) of the overall x, whereMathematically, the sample transferset X = ( xlearning, x , … ,contains x ) ∈ Ӽ ,two φ( Ӽ elements: ) = ∏n p( domainsx ). For a domainand tasks U [42].= {Ӽ, φ The(Ӽ)}, domains we defined Mathematically,Mathematically, transfer transferlearning learning1 2 containscontainsn two elements: i=1 d domainsomainsi and and tasks tasks [42]. [42 ].The The domains domains U containsthe task two Г on elements: U as containing sample featuretwo elements: space Ӽ the and label the probabilityspace У and distribution the function p (fx, )where of the У overallis a discrete x, UUcontains contains two two elements: elements: samplesample feature feature space space Ӽ andand the the probability probability distribution distribution p(x)p of(x )the of ov theerall overall x, nQi=   whererandom the sample variable set Xwith = (x uniform, x , … , xdistribution.) ∈ Ӽ, φ( Ӽ ) The= n∏ setp( xY1) .= For (y a, ydomain, … ,y U ) ∈= {У Ӽ, φis( Ӽthe)}, wesample defined of the x,where where the the sample sample set set X X= (=x1(,x x12,, x…2, ,... x n,) x∈n )Ӽ, φ(,Ӽϕ) =( ∏) =ip=1(xn). iForp(x ai )domain.1 For2 a domainU n= {Ӽ, φU(Ӽ)=}, {we, ϕdefined}, we 1 2 n ∈ i=1 i definedthethe taskpopulation,task the ГГ onon task U as Гandon containing UУ asis called containing twotwo the elements: elements: label two elements:space t hethe label labelof Yspace the ,space where label У У and andf space: Ӽthe the Уfunction functioncanand be the f , obtainedwhere functionf, where У is fromУ f ,a whereis discrete a thediscrete training is random variable with uniform distribution. The set Y = (y ,y , … ,y ) ∈ У is the sample of the a discreterandomsample random variable {xi,y } variable, withxi ∈ unX, withiformy ∈ uniformY distribution.. From distribution.the view The of set Theprobability,Y = set (yY,y1=, ( 2…yf 1can ,,yy ) ,be n∈... thoughtУ, yis ) the ofsisample theas the sample of conditional the of i i 1 2 n2 n ∈ thepopulation,population, population,probability and and p У(У y|xis is called ),iscalled and called the thethe label the task label label space can space space ofbe Y represented ,of w of hereYY, ,where where f: Ӽ as Уff :can : ГӼ = be { УУ obtained , canp(y | bex)}. obtained obtainedfrom The thetwo training from fromtasks the the are sample training trainingdifferent, if , ∈ ∈ → samplesample{x ,yand}, { x {onlyix, i∈y,y iX}, } ,ifyx xithe i∈ Y XlabelX. ,From,yyi space theY. From view У has of thethe atprobability, view viewleast ofoneof probability,probability, fdifference can be thought fwhenf cancan of be becompared as thought thought the conditional of toof asconditional as the the probability conditional conditional probability i i i i i ∈ i ∈ probabilityprobabilityp p(y(y|x|x),), and and the the task task can can be be represented represented as asГ Г = ={ {У, ,p p(y(y|nx|S)}.x)}. The The two two tasks tasks are are di different,fferent, if if p(y|xp()y,| andx). Whenthe task the can source be represented domain ‐aslabeled Г = {У, datap(y|x )}.DS The= {x twoSi,y tasks} and are targetdifferent, domain if and nononly‐ labeledif the data Si i=1 andandlabel only only space if if the Уthen hasT label label at least space space one У hasdifferen has at at least celeast when one one dicompared differencefference to when when conditional compared compared probability to to conditional conditional p(y|x). When probability probability the DT = {xT } are fixed, we can finally minimizen the binarynSn Sloss function ∑i=1 l f (x ),y n to improve pp(y(y|x|).x). When Wheni thei=1 the source source domain-labeled domain ‐labeled data SdataDS =D x= {x, y,y } .and and target domaindomain non-labeledTnon Ti‐labeledTi T data data source domain-labeled data DS = {xSi,y } and targetS Si S domainiSi Si=1 non-labeled data DT = {xTi} are Si i=1 { } i i=1   i=1 the learningnT effect of the target task. In the absence of the target domainP calibration data, knowledge DT = x nT are fixed, we can finally minimize the binary loss function l fT(x ), y to improve D = {x T}i i= 1are fixed, we can finally minimize the binary∑ loss( function )i∑=1 l f (xTi ),yTi to improve fixed,T transfer{ T wei } can can finally be accomplished minimize the by b reducinginary loss the function distribution i=1 l differencefT(xT ),yT betweentoi=1 improveT theT source theT learning domain and the learningi=1 effect of the target task. In the absence of the target domaini i calibrationi data,i knowledge theeffect learningthe of target the targeteffect domain oftask. the [43]. In target the absence task. In of the the absence target domain of the targetcalibration domain data, calibration knowledge data, transfer knowledge can transfer can be accomplished by reducing the distribution difference between the source domain and transferbe accomplished canDue be to accomplished the by smallreducing number bythe reducing distribution of open the‐pit differencedistribution mine manual between difference labeling the sourcebetween target domain datasets, the source and the the domain Mask target andR ‐CNN the target domain [43]. thedomain targetnetwork [43]. domain should [43]. be pre‐trained on the dataset firstly to prevent the model from overfitting. It can be Due to the small number of open-pit mine manual labeling target datasets, the Mask R-CNN seenDueDue that toto the the small label spacenumbernumber of oftheof openopen source-pit‐pit domainmine mine man manual (a ualdataset labeling labeling has alreadytarget target datasets, been datasets, trained) the the Mask andMask R target-CNN R‐CNN domain network should be pre-trained on the dataset firstly to prevent the model from overfitting. It can networknetwork(untrained shouldshould open be pre‐pit-‐trainedtrained mines dataset)onon the the dataset dataset are different, firstly firstly to toand prevent prevent the transfer the the model model learning from from overfitting. of overfitting. open‐pit It mines can It can be belongs be be seen that the label space of the source domain (a dataset has already been trained) and target seenseento thatthat tag thethe‐differential label space transfer ofof thethe source sourcelearning. domain domain Thus, (a (awe dataset dataset must has makehas already already full beenuse been of trained) thetrained) Mask and and Rtarget‐CNN target domain in domain the source domain (untrained open-pit mines dataset) are different, and the transfer learning of open-pit mines (untrained(untraidomainned open to guide‐-pit mines the target dataset)dataset) recognition are are different, different, of andthe and thenew the transfer transferopen‐ pitlearning learning mine ofdataset ofopen open- pit[44].‐pit mines minesMask belongs Rbelongs‐CNN has belongs to tag-differential transfer learning. Thus, we must make full use of the Mask R-CNN in toto tagtagalready‐-differentialdifferential trained transfer the weights learning.learning. for Thus, automaticThus, we we must mustidentification make make full full useof use about of ofthe the 80Mask Maskcategories, R- CNNR‐CNN suchin thein asthe source aircraft source and thedomain source to domain guide to the guide target the recognition target recognition of the new of the open new-pit open-pit mine dataset mine dataset [44]. Mask [44]. R Mask-CNN R-CNN has domainpedestrians. to guide By the selecting target recognition pre‐trained ofResNet101 the new toopen initialize‐pit mine the datasetmodel, the[44]. transfer Mask Rvalues‐CNN of has open ‐ hasalready already trained trained the theweights weights for automatic for automatic identification identification of about of about 80 categories, 80 categories, such as such aircraft as aircraft and alreadypit mines trained were the savedweights in fora dataset automatic (source identification domain). Asof aboutthe Figure 80 categories, 5 shows, suchthe pre as ‐aircrafttrained andprocess andpedestrians. pedestrians. By selecting By selecting pre-trained pre-trained ResNet101 ResNet101 to initialize to initialize the model, the model,the transfer the transfervalues of values open- of pedestrians.generated By 1024 selecting features pre through‐trained the ResNet101 fully connected to initialize layer, the and model, in the the Softmax transfer layer, values 80 features of open were‐ open-pitpit mines mines were were saved saved in a in dataset a dataset (source (source domain). domain). As Asthethe Figure Figure 5 shows5 shows,, the the pre pre-trained-trained process process pit minesfinally were generated. saved inOn a datasetthe basis (source of source domain). domain As theweight, Figure we 5 usedshows, transfer the pre learning‐trained processto find the generatedgenerated 1024 1024 features features through through thethe fullyfully connectedconnected layer, layer, and and in in the the Softmax Softmax layer, layer, 80 80 features features were were generatedcategory 1024 with features the closest through characteristics the fully connected of open‐ pitlayer, mines, and andin the the Softmax generalization layer, 80 performance features were of the finallyfinally generated. generated. On On the the basis basis of source of source domain domain weight, weight, weused we use transferd transfer learning lear toning find to the find category the finallymodel generated. was greatly On improved.the basis of source domain weight, we used transfer learning to find the withcategory the closest with the characteristics closest characteristics of open-pit of open mines,-pit andmines, the and generalization the generalization performance performance of the of model the category with the closest characteristics of open‐pit mines, and the generalization performance of the wasmodel greatly was improved. greatly improved. model was greatly improved.

FigureFigure 5. Transfer 5. Transfer learning. learning. Figure 5. Transfer learning. 4. Experiments Figure 5. Transfer learning. 4. Experiments 4. Experiments4.1. Experiment Data 4.1. Experiment Data 4.1. 4.1.1.Experiment Data SourceData and Identification Index 4.1.1. Data Source and Identification Index Data sources include image data and auxiliary data. Image data are mainly used for extraction 4.1.1. DataData sourcesSource includeand Identification image data Index and auxiliary data. Image data are mainly used for extraction andand verification verification of openof open-pit ‐ mines,pit mines, while while auxiliary auxiliary data aredata mainly are mainly used toused show to theshow spatial the spatial Data sources include image data and auxiliary data. Image data are mainly used for extraction

and verification of open‐pit mines, while auxiliary data are mainly used to show the spatial

Remote Sens. 2020, 12, 3474 8 of 20

4. Experiments

4.1. Experiment Data

4.1.1. Data Source and Identification Index Data sources include image data and auxiliary data. Image data are mainly used for extraction and verification of open-pit mines, while auxiliary data are mainly used to show the spatial distributionRemote Sens. 20 characteristics20, 12, x FOR PEER of REVIEW open-pit mines. We used Gaofen-1 and Gaofen-2 satellite images8 of 19 to monitor the geological environment of open-pit mines, and for areas that were not covered or had uncleardistribution images, characteristics we supplemented of open the-pit image mines. data We usedwith GaofenGoogle-1 Earth and satelliteGaofen-2 imagessatellite of images the same to period.monitor The the satellite geological image environment data time of is fromopen- Januarypit mines, to and August for areas 2019. that All were thesatellite not covered images or had have beenunclear preprocessed images, we by supplemented radiometric calibration, the image atmospheric data with Google correction, Earth satellite orthographic images correction of the same and imageperiod. fusion. The satellite According image to thedata open-pit time is from mines January distribution to August of mineral 2019. All resources the satellite in Hubei images Province, have thebeen identification preprocessed indexes by radiometric of main open-pit calibration, mines atmospheric are established correction, according orthographic to the mining correction degree and and spatialimage texture fusion.characteristics According to the of open surface-pit [mines45]. The distribution interpretation of mineral marks resources established in Hubei in the Pr open-pitovince, minesthe identification were verified indexes by field of investigation, main open-pit and mines the are interpretation established according signs are summarizedto the mining asdegree Figure and6. spatial texture characteristics of surface [45]. The interpretation marks established in the open-pit (1)minesAfter were the verified rocks and by soilsfield investigation, are stripped, theand surfaces the interpretation of open-pit signs mines are aresummarized mainly steep as Figure walls 6 and. (1) platforms.After the rocks The bedrock and soils is are bare stripped, and fresh, the andsurface somes of of open the- platformspit mines are mainly piled up steep with walls the wasteand residue.platforms. Therefore, The bedrock the features is bare and of open-pit fresh, and mines some in of true-colorthe platforms images are piled mainly up appearwith the white waste or yellow,residu ande. Therefore, there are the multiple features steps of open or steep-pit mines slopes in connected true-color with images roads. mainly appear white or (2) Inyellow, the false-color and there image, are multiple the diff stepserence or betweensteep slopes the connected open-pit mines with roads. and the images background (2) isIn obvious. the false The-color open-pit image, mine’sthe difference surface between is mainly the gray open or-pit gray-black mines and with the images a simple backgroun texture andd regionalis obvious. block, The which open- ispit in mine great’s contrastsurface is with mainly the backgroundgray or gray forest-black vegetationwith a simple that texture a has roughand textureregional and block, reddish which brown is in great reflection. contrast with the background forest vegetation that a has rough texture and reddish brown reflection.

(a) (b) (c) (d)

(e) (f) (g) (h)

FigureFigure 6. 6.Identification Identification index. index. True-colorTrue-color images: images: (a ()a )iron iron m mineine in inQichun ; County; (b) (qbuarry) quarry mine mine in in Yangxin County; ( (cc)) g graniteranite m mineine in in Xingshan ; County; (d (d) )c coaloal m mineine in in Huangshi Huangshi City; City; false false-color-color images:images: (e )(e coal) coal mine mine in in Yuan’an Yuan`an County; County; ( f(f)) quarry quarry minemine in Chibi City; City; ( (gg)) q quarryuarry m mineine in in Chongyang Chongyang County;County; (h ()h coal) coal mine mine in in Daye Daye City. City.

4.1.2. Sample Database The sample database of IMRT is mainly divided into user samples and model samples. User samples are mainly extracted from field investigation or visual interpretation of remote sensing images, which are the basis of model samples. Based on the user samples, remote sensing images are converted into a sample format, which is required for the model training and testing through a series of data processing. The experimental source domain data are in the COCO dataset, which has a total of 80 samples with labeled categories. The target domain data are the non-labeled open-pit mines images. The full name of MS COCO is Microsoft Common Objects in Context, which is an object detection and segmentation dataset. The target from complex daily scenes is mainly extracted, and the target of interest can be calibrated by precise segmentation [46,47]. MS COCO uses json files to

Remote Sens. 2020, 12, 3474 9 of 20

4.1.2. Sample Database The sample database of IMRT is mainly divided into user samples and model samples. User samples are mainly extracted from field investigation or visual interpretation of remote sensing images, which are the basis of model samples. Based on the user samples, remote sensing images are converted into a sample format, which is required for the model training and testing through a series of data processing. The experimental source domain data are in the COCO dataset, which has a total of 80 samples with labeled categories. The target domain data are the non-labeled open-pit mines images. The full name of MS COCO is Microsoft Common Objects in Context, which is an object detection and segmentation dataset. The target from complex daily scenes is mainly extracted, and theRemo targette Sens. of 20 interest20, 12, x FOR can PEER be calibratedREVIEW by precise segmentation [46,47]. MS COCO uses json9 files of 19 to store the information about target of interest, and it classifies all data into training sets (validation sets store the information about target of interest, and it classifies all data into training sets (validation are included in the training sets) and test sets. These mainly include object instances, object key points sets are included in the training sets) and test sets. These mainly include object instances, object key and image captions. points and image captions. In this paper, the open-pit mine interpretation symbol database was established from the color In this paper, the open-pit mine interpretation symbol database was established from the color and texture index of remote sensing images, and on this basis, a training open-pit mine sample and texture index of remote sensing images, and on this basis, a training open-pit mine sample dataset of IMRT was made on the structure of the MS COCO classic sample dataset. The experimental dataset of IMRT was made on the structure of the MS COCO classic sample dataset. The experimental datadata were were semantically semantically segmentedsegmented in the open open-pit-pit mines mines of of each each image, image, and and th thee resolution resolution of ofeach each imageimage was was twotwo meters.meters. The The open open-pit-pit mine mine sample sample dataset dataset mainly mainly adopts adopts the following the following format: format: the thecv2_mask cv2_mask folder folder mainly mainly contains contains all the all 8-bit the mask 8-bit labels mask needed labels for needed training for; the training; json folder the jsonincludes folder includesthe json thefile that json represents file that represents the coordinates the coordinates of the training of theobject training; the labelme_json object; the folder labelme_json contains folder the containsannotation the annotationsamples for samples the semantic for the segmentation semantic segmentation of open-pit mines of open-pit in img.png, mines info.yaml, in img.png, label.png, info.yaml, label.png,label_names.txt label_names.txt and label_v andiz.png label_viz.png;; and the pic and folder the mainly pic folder contains mainly all the contains image alldata the in image jpg format. data in jpgThe format. structure The is structure shown in is Figure shown 7. in Figure7.

cv2_mask json label_json pic

FigureFigure 7.7. Dataset folder folder of of t trainingraining data. data.

4.2.4.2. Training Training Environment Environment and and FunctionFunction AnalysisAnalysis ThisThis experiment experiment was was conducted conducted in a in Windows a Windows 64-bit 64 operating-bit operating system system with 16GB with running 16GB running memory. Amemory. quad-core A Intel quad CORE-core I5 Intel 9th CORE Gen CPU I5 9th was Gen configured, CPU was and configured a GeForce, GTXand a 1650TI GeForce graphics GTX 1650TIcard was equipped.graphics Thecard running was equipped. environment The running of this model environment is python3.5.6, of this and model the isframework python3.5.6 is, TensorFlow. and the Theframework training ofis T theensorFlow. IMRT model The training mainly dependsof the IMRT onthe model libraries, mainly such depends as TensorFlow, on the libraries Keras,, such OpenCV as andTensorFlow, PIL. TensorFlow Keras, usesOpenCV a data and flow PIL. diagram TensorFlow to design uses thea data computational flow diagram flow, to which design enables the thecomputational users to train flow large-scale, which neural enable networkss the users with to parallel train large operations.-scale neural Keras networksis an advanced with library parallel for rapidlyoperations. prototyping Keras is deepan advanced learning, library and for it has rapidly an excellent prototyping expansibility deep learning, for TensorFlow. and it has an PILexcellent mainly obtainsexpansibility the color for of TensorFlow. the pictures PIL by comparingmainly obtains them the with color the of color the library.pictures On by thecomparing other hand, them OpenCV with recognizesthe color colorslibrary by. O distinguishingn the other hand, the HSVOpenCV (Hue, re Saturation,cognizes colors Value) by components distinguishing of thethe picture HSV (Hue, strictly. TheseSaturation, libraries Value) provide components a good foundation of the picture for usstrictly. to obtain These the libraries low-lever provide and the a good high-level foundation features for of us to obtain the low-lever and the high-level features of open-pit mine images better. open-pit mine images better. The size of sample datasets was set to 600 × 600, and the true-color images (experimental The size of sample datasets was set to 600 600, and the true-color images (experimental samples) and the false-color images (reference samples)× were trained, respectively. The total number samples) and the false-color images (reference samples) were trained, respectively. The total number of experimental samples is 600, and there were 400 reference samples. The experimental samples and of experimental samples is 600, and there were 400 reference samples. The experimental samples and reference samples were trained in batches. With an initial learning rate of 10−3 [48], the network had reference samples were trained in batches. With an initial learning rate of 10 3 [48], the network had a total of 100,000 training epochs. During the experiment, the COCO dataset divided− all the data into a total of 100,000 training epochs. During the experiment, the COCO dataset divided all the data into a training set and a test set, among which the validation set is included in the training set. a trainingBy controlling set and a test the set,constant among of bat whichch size, the we validation planned set to analyze is included the accuracy in the training changes set. in training and validation accuracy by adjusting the ratio of the training set and validation set. After that, we controlled the ratio between the training set and the validation set constant again to obtain the influence of batch size changes on the training and validation accuracy. As shown in Figure 8, the validation accuracy of the true-color image with a ratio of 70%:30% was lower than that with a ratio of 80%:20%, however, the training accuracy of the true-color image as well as the training and validation accuracies of the false-color image were higher than other ratios. When the ratio of the training set and validation set remains unchanged in the process of batch size increasing, the accuracy of the true-color images and false-color images presents an overall trend of increasing and then regional stability. As shown in Figure 9, when the batch size is 1000, there is an advantage. Therefore, the ratio of the training set and validation set is 70%:30%, and the batch size is 1000 to participate in the open-pit mine identification.

Remote Sens. 2020, 12, 3474 10 of 20

By controlling the constant of batch size, we planned to analyze the accuracy changes in training and validation accuracy by adjusting the ratio of the training set and validation set. After that, we controlled the ratio between the training set and the validation set constant again to obtain the influence of batch size changes on the training and validation accuracy. As shown in Figure8, the validation accuracy of the true-color image with a ratio of 70%:30% was lower than that with a ratio of 80%:20%, however, the training accuracy of the true-color image as well as the training and validation accuracies of the false-color image were higher than other ratios. When the ratio of the training set and validation set remains unchanged in the process of batch size increasing, the accuracy of the true-color images and false-color images presents an overall trend of increasing and then regional stability. As shown in Figure9, when the batch size is 1000, there is an advantage. Therefore, the ratio of the training set and validation set is 70%:30%, and the batch size is 1000 to participate in the open-pit mine identification. Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 19

0.970

0.960

0.950

0.940

0.930 Accuracy 0.920

0.910

0.900

0.890 training accuray validation accuray training accuray validation accuray (true-color image) (true-color image) (false-color image) (false-color image)

50%:50% 60%:40% 70%:30% 80%:20%

Figure 8. Training and validation accuracies of the true-color image and false-color image under four Figure 8. Training and validation accuracies of the true-color image and false-color image under four different ratios of training and validation samples. different ratios of training and validation samples.

0.965

0.960

0.955

0.950

0.945

0.940

Accuracy 0.935

0.930

0.925

0.920

0.915 50 100 200 300 500 1000 1500 2000 3000

training accuray validation accuray (true-color image) (true-color image) training accuray validation accuray (false-color image) (false-color image)

Figure 9. Training and validation accuracies of the true-color image and false-color image under different batch sizes.

Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 19

0.970

0.960

0.950

0.940

0.930 Accuracy 0.920

0.910

0.900

0.890 training accuray validation accuray training accuray validation accuray (true-color image) (true-color image) (false-color image) (false-color image)

50%:50% 60%:40% 70%:30% 80%:20%

Remote Sens.Figure2020 8., 12 Training, 3474 and validation accuracies of the true-color image and false-color image under four 11 of 20 different ratios of training and validation samples.

0.965

0.960

0.955

0.950

0.945

0.940

Accuracy 0.935

0.930

0.925

0.920

0.915 50 100 200 300 500 1000 1500 2000 3000

training accuray validation accuray (true-color image) (true-color image) training accuray validation accuray (false-color image) (false-color image)

Figure 9. Training and validation accuracies of the true-color image and false-color image under Figure 9. Training and validation accuracies of the true-color image and false-color image under different batch sizes. different batch sizes.

4.3. Accuracy Evaluation The evaluation of open-pit mine extraction accuracy is mainly carried out in two ways: pixel-based evaluation and object-based evaluation. The pixel-based evaluation methods evaluate the extracted pixels and aim to reflect the consistency of the extracted results in geometric accuracy and shape similarity. The object-based evaluation methods can avoid the pixel bias error. With the purpose of analyzing the potential causes of extraction errors, the evaluation results can be correlated with the parameters by quantifying the number of extracted targets. In order to solve the problems such as a fuzzy boundary or complex internal structure, in this paper, we used the two accuracy evaluation methods to evaluate the extraction results. The evaluation indexes based on pixel are mainly composed of pixel accuracy (PA), comprehensive evaluation index (F1) and Kappa coefficient [49]. PA is an evaluation index to calculate the matching proportion of the predicted pixel values and the real pixel values. The higher the PA, the higher matching degree of the predicted value and the real value. F1 can be regarded as the binary classification problem of open-pit mine targets and backgrounds. Kappa coefficient represents the coincidence degree between the classified image and the reference image, and it is an objective evaluation standard to test their consistency. Pk i= pij PA = 0 (8) Pk Pk i=0 j=0 pij Precision Recall F1 = 2 ∗ (9) × Precision + Recall po p Kappa = − e (10) 1 p − e where po is the number of the correctly classified samples divided by the total number of samples, and pe is the number of the misclassified samples divided by the total number of samples. Remote Sens. 2020, 12, 3474 12 of 20

In terms of object-based evaluation, Precision, Recall, FalseAlarm and MissingAlarm [50–52] are mainly used to quantitatively evaluate the predicted results. Precision is the ratio between the number of open-pit mines extracted correctly (TP) and the count of the extraction open-pit mines identified by IMRT. Recall is the ratio between the number of open-pit mines and the number of open-pit mines in the sample database. FalseAlarm is the ratio between the number of open-pit mine target identifications wrongly (FP) extracted and the count of the identification results. MissingAlarm is the ratio between the number of omission extractions (FN) and the number of open-pit mines in the sample database.

TP Precision = (11) TP + FP

TP Recall = (12) TP + FN FP FalseAlarm = (13) TP + FN FN MissingAlarm = (14) TP + FN

5. Discussion

5.1. Open-Pit Mine Identification Results Figure 10 compares the intelligent extraction model with traditional extraction methods of multi-source remote sensing. The extraction results of SVM and MLE have better integrity on the whole, but these traditional extraction methods are mainly affected by the surface objects with high reflectivity, such as bare ground and road, so it is easy for them to be misclassified. Faster R-CNN can locate and classify the target on the spatial position of each open-pit mines accurately, but it lacks the boundary mask information. Compared with the three methods, the IMRT model has better performance in the comprehensive locating boundary precision, the fragmentation degree and the integrity of the extraction boundary of the extraction results. The accuracy evaluation method described above is used to quantify the extraction results of IMRT, faster R-CNN, SVM and MLE models. The pixel-based accuracy evaluation results are shown in Table1, and the object-based accuracy evaluation results are shown in Table2. The comparison results are as follows: In true-color images, SVM and MLE extracted the pixels belonging to the supervised classification characteristics of open-pit mines as much as possible, but there were also obvious errors in the extraction results. In Figure 10A, the two traditional extraction methods mistakenly divided the marginal road into an open-pit mine. As shown in Figure 10C, farmland and other land types were classified as open-pit mines. During the influence of spectral characteristics, the open-pit mine extraction of the MLE classification was severely broken and low in accuracy. The extraction results of false-color images by all methods performed well. However, the deep learning model can extract the complete boundary information of block regions. To some extent, the undeveloped areas in the center of open-pit mines also participated in the overall target extraction, so the traditional model performed a little better than the deep learning method, as shown in Figure 10B. Faster R-CNN had a relatively good identification ability for open-pit mines, but for large open-pit mines, the selection area of the identification box was too large. As shown in Figure 10D, when many targets are in the area, it is difficult to truly locate the open-pit mine targets in the image. Compared with the traditional extraction method, the IMRT model was able to extract the complete structures and obvious edge features of small open-pit mines, and it did not cause leakage or misclassification. As shown in Figure 10D, because the false-color image eliminates the influence of water vapor, the extraction effect of open-pit mines with the false-color image is even better than the true-color image. That is to say, the extraction results obtained by IMRT are more consistent with the real surface values. Remote Sens. 2020, 12, 3474 13 of 20

In terms of pixel-based evaluation, among the two traditional extraction methods, the results of SVM have a better performance, the index of F1 is 0.7148, and the Kappa coefficient is 0.6943, but the PA is slightly lower than that of the MLE, which is 0.9320. For the extraction results of the IMRT model and the faster R-CNN model, the evaluation indexes of the IMRT model are higher than the other one. Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 19 However, the lowest value of accuracy index in the deep learning method is higher than the highest oneopen of- thepit mines.traditional Compa method,red with indicating faster R that-CNN, compared IMRT has with a better the traditional performance method, on MissingAlarm deep learning, haswhich more proves advantages that this in model open-pit has minea higher recognition. accuracy of The identification. IMRT model Thus, has IMRT the best is more performance suitable for in PA andtheKappa identificationcoefficient, of open with-pit values mine ofs with 0.9718 multi and-source 0.8251, remote respectively. sensing images.

(a) (b) (c) (d) (e) A. Silica mine in .

(a) (b) (c) (d) (e) B. Granite mine in Xingshan County.

(a) (b) (c) (d) (e) C. Colliery mine in Yuan’an County.

(a) (b) (c) (d) (e) D. Quarry mine in Chibi City.

FigureFigure 10. 10. MineMine identification identification reresults.sults. (a (a) )Improved Improved Mask Mask R- R-CNNCNN and and Transfer Transfer learning learning (IMRT (IMRT)) identificationidentification results results of of the the true-color true-color image; image; ( b(b)) IMRT IMRT identification identification resultsresults ofof thethe false-colorfalse-color image;image; (c) faster(c) faster R-CNN R-CNN identification identification results; results; (d) Maximum (d) Maximum Likelihood Likelihood (MLE) (MLE identification) identification results; resul (e)ts; Support (e) VectorSupport Machine Vector (SVM)Machine identification (SVM) identification results. results.

Table 1. Precision evaluation based on pixels.

IMRT faster R-CNN SVM MLE PA 0.9718 0.9454 0.932 0.9438 F1 0.8377 0.8465 0.7149 0.6514 Kappa coefficient 0.8251 0.7955 0.6514 0.6733

Remote Sens. 2020, 12, 3474 14 of 20

Table 1. Precision evaluation based on pixels.

IMRT faster R-CNN SVM MLE PA 0.9718 0.9454 0.932 0.9438 F1 0.8377 0.8465 0.7149 0.6514 Kappa coefficient 0.8251 0.7955 0.6514 0.6733

Table 2. Precision evaluation based on objects.

IMRT faster R-CNN SVM MLE Precision 0.8667 0.8857 0.4778 0.5125 Recall 0.9138 0.7391 0.8696 0.6826 MissingAlarm 0.0862 0.2609 0.1304 0.3174 FalseAlarm 0.1333 0.1143 0.5222 0.4874

According to object-based evaluation, the Recall and MissingAlarm of MLE classification results are the worst, with values of 0.6826 and 0.3174, respectively, indicating that this method could not identify or classify the open-pit mines well. The FalseAlarm of SVM is the highest, which shows that the error extractedRemote Sens. by2020 this, 12, methodx FOR PEER is theREVIEW biggest. Among the results of IMRT and the faster R-CNN, IMRT14 of has 19 the best Recall and MissingAlarm, with values of 0.9138 and 0.0862. In the index of FalseAlarm, faster R-CNN performs better, at 0.1143,Table which 2. Precision is a little evaluation lower than based the on IMRT objects model. at 0.0190. Faster R-CNN performs better in terms of FalseAlarm; its value of 0.1143 is slightly lower than that of the IMRT model. IMRT faster R-CNN SVM MLE Of the three types of satellite image samples statistics (Figure 11), the MissingAlarm of Google Earth Precision 0.8667 0.8857 0.4778 0.5125 satellite images and the FalseAlarm of Gaofen-2 are the highest. However, there is a small difference Recall 0.9138 0.7391 0.8696 0.6826 between the MissingAlarm and the FalseAlarm of the three satellite image samples, which indicates that MissingAlarm 0.0862 0.2609 0.1304 0.3174 the IMRT model has good applicability to Gaofen-1, Gaofen-2 and Google Earth satellite images. FalseAlarm 0.1333 0.1143 0.5222 0.4874

Gaofen-1 Gaofen-2 Google Earth 40.00% 37.70% 33.80% 35.20% 35.00% 32.70% 31.00% 29.60% 30.00%

25.00%

20.00%

15.00%

10.00%

5.00%

0.00% MissingAlarm FalseAlarm

Figure 11. MissingAlarm and FalseAlarm based on objects. Figure 11. MissingAlarm and FalseAlarm based on objects.

5.2. OpenTo sum-Pit up,Mine for Dynamic the multi-source Monitoring remote sensing image identification of open-pit mines, the deep learningThe identificationremote sensing method interpretation is better results than the of traditionalHubei Province target show identification that there method are open in-pit precision mines andin the eff ect. key To mining be specific, areas the of Precision the whole, F1 province.and FalseAlarm Amongof the which, faster the R-CNN open- modelpit mines are the in best, western, and IMRTnorthe isrn, better central in andPA, easternKappa coe of Hubeifficient, ProvinceRecall and areMissingAlarm relatively concentrated, which shows, and the that central both the region IMRT is modelcomparatively and the sparse. faster R-CNN There are deep 25 key learning mining model areas have in Yichang a good, ethusffect representing on the target the identification area with the of open-pitmost mines mines. in Hubei Compared Province with, followed faster by R-CNN, IMRT and hasHuan a bettergshi. According performance to the on comprehensiveMissingAlarm, whichinterpretation proves thatof key this mining model areas has a in higher each accuracycity, Huangshi of identification. has the largest Thus, number IMRTis of more open suitable-pit mines. for theJingmen identification and Shiyan of open-pit have the mines most abundant with multi-source interpretation remote types, sensing and, images. on the contrary, Huanggang, and have the least. Based on the Geological Environment Monitoring Report of 130 Key Mining Areas in Hubei Province and the target identification results of IMRT for open-pit mine during 2017–2019, there were 1213 open-pit mines in 2017, 1179 in 2018, and 1114 in 2019. As is shown in Figure 12, in the three-year dynamic monitoring, the number of open-pit mines decreases year by year, while the area of the majority open-pit mines continues to increase. For example, the quarry mine in Shiyan City, the coal mine in Huangshi City and the silica mine in Fang County have been continuously expanding in the past three years. The monitoring results in this paper show that the target boundary of the open-pit mines is continuously expanding. The target recognition study based on the IMRT model has promoted the application of automatic extraction in open-pit mines.

2017 2018 2019 A. Quarry mine in Shiyan City.

Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 19

Table 2. Precision evaluation based on objects.

IMRT faster R-CNN SVM MLE Precision 0.8667 0.8857 0.4778 0.5125 Recall 0.9138 0.7391 0.8696 0.6826 MissingAlarm 0.0862 0.2609 0.1304 0.3174 FalseAlarm 0.1333 0.1143 0.5222 0.4874

Gaofen-1 Gaofen-2 Google Earth 40.00% 37.70% 33.80% 35.20% 35.00% 32.70% 31.00% 29.60% 30.00%

25.00%

20.00%

15.00%

10.00%

5.00%

0.00% Remote Sens. 2020, 12, 3474 15 of 20 MissingAlarm FalseAlarm

Figure 11. MissingAlarm and FalseAlarm based on objects. 5.2. Open-Pit Mine Dynamic Monitoring 5.2.The Open remote-Pit Mine sensing Dynamic interpretation Monitoring results of Hubei Province show that there are open-pit mines in the keyThe mining remote areas sensing of the interpretation whole province. results Among of Hubei which, Province the open-pitshow that mines there inare western, open-pit northern, mines centralin the and key eastern mining of Hubei areas Province of the whole are relatively province. concentrated, Among which, and the the central open-pit region mines is comparatively in western, sparse.northe Therern, central are 25 and key eastern mining of areasHubei in Province Yichang, are thus relatively representing concentrated the area, and with the thecentral most region mines is in Hubeicomparatively Province, followedsparse. There by Xianning are 25 key and mining Huangshi. areas in According Yichang, thus to the representing comprehensive the area interpretation with the ofmost key mining mines in areas Hubei in Province each city,, followed Huangshi by Xianning has the largest and Huan numbergshi. According of open-pit to mines.the comprehensive Jingmen and Shiyaninterpretation have the of most key abundant mining areas interpretation in each city, types, Huangshi and, has on thethe contrary,largest number Huanggang, of open Jingzhou-pit mines. and ShennongjiaJingmen and have Shiyan the have least. the Based most abundant on the Geological interpretation Environment types, and, Monitoring on the contrary, Report Huanggang, of 130 Key MiningJingzhou Areas and in Shennongj Hubei Provinceia have and the theleast. target Based identification on the Geological results Environment of IMRT for open-pitMonitoring mine Report during 2017–2019,of 130 Key there Mining were Areas 1213 in open-pitHubei Province mines and in 2017, the target 1179 identification in 2018, and 1114results in of 2019. IMRT As for is open shown-pit in Figuremine 12 during, in the 2017 three-year–2019, there dynamic were monitoring,1213 open-pit the mines number in 2017, of open-pit 1179 in 2018, mines and decreases 1114 in 2019. year byAs year,is whileshown the in area Figure of the 12 majority, in the three open-pit-year dynamic mines continues monitoring, to increase. the number For of example, open-pit themines quarry decreases mine in Shiyanyear City,by year, the while coal mine the area in Huangshi of the majority City and open the-pit silica mines mine continues in Fang Countyto increase. have For been example, continuously the expandingquarry mine in the in Shiyan past three City, years. the coal The min monitoringe in Huangshi results City in and this the paper silica show mine that in Fang the target County boundary have ofbeen the open-pit continuously mines expanding is continuously in the past expanding. three years. The The target monitoring recognition results study in this based paper on show the that IMRT the target boundary of the open-pit mines is continuously expanding. The target recognition study model has promoted the application of automatic extraction in open-pit mines. based on the IMRT model has promoted the application of automatic extraction in open-pit mines.

2017 2018 2019 Remote Sens. 2020, 12, x FOR PEER REVIEW A. Quarry mine in Shiyan City. 15 of 19

2017 2018 2019 B. Coal mine in Huangshi City.

2017 2018 2019 C. Silica mine in Fang County.

FigureFigure 12. 12.Open-pit Open-pit mine changes from from 2017 2017 to to 2019. 2019.

5.3. Assessment of Mine Environment Damages The characteristics and intensity of mines’ geological environment problems are closely related to mines’ geological environment background and topographical landscape damages. Therefore, the damages rate of the topographical landscape (DROTL) is an important index to demonstrate the influence of mines’ geological environments. In this paper, three levels are used to evaluate the impact degree of topographical landscapes. Level I represents the DROTL when it is above 40%; level II represents the DROTL when it is between 20% and 40%; and level III represents the DROTL when it is less than 20%. The evaluation formula is as follows:

i ∑0 Uk DROTL = j × 100% (15) ∑ 0 Um

where Uk is the topographical landscape damaged area and Um is the open-pit mining area. The damages of topographical landscape are evaluated with the recognition results of IMRT and the remote sensing interpretation results. As shown in Figure 13, the total number of key mining areas in Shennongjia is two, representing the smallest number. As we can see, the DROTL level of all the key mining areas is Ⅲ, showing that the damage degree of this area is relatively slight. The maximum number of key mining area is 24 in Yichang, and the DROTL levels are mostly Ⅲ, with only 1 for level I. This indicates that the mines in this region are in reasonable exploitation. The topographical landscapes in Huangshi, Xianning and Jingmen are seriously damaged, and with 75.0%, 68.8% and 53.8% of the topographical landscape destruction in their regions, respectively, they are represented by level I. A total of 36.2% of topographical landscape damage to the key mining area reached level I in Hubei Province, which shows that mineral exploitation has caused serious damage to the topographical landscape. In the remote sensing interpretation of mines’ geological environments, the level of land occupation and destruction (Table 3) is mainly evaluated through the degree of occupation and destruction of farmland, forest land (or grassland) and unused land. As shown in Table 4, in the evaluation, there are 45 key mining areas in level I, 37 key mining areas in level II, 44 key mining areas in level III, and 4 key mining areas without mining land occupation or destruction. Level I of

Remote Sens. 2020, 12, 3474 16 of 20

5.3. Assessment of Mine Environment Damages The characteristics and intensity of mines’ geological environment problems are closely related to mines’ geological environment background and topographical landscape damages. Therefore, the damages rate of the topographical landscape (DROTL) is an important index to demonstrate the influence of mines’ geological environments. In this paper, three levels are used to evaluate the impact degree of topographical landscapes. Level I represents the DROTL when it is above 40%; level II represents the DROTL when it is between 20% and 40%; and level III represents the DROTL when it is less than 20%. The evaluation formula is as follows:

Pi Uk DROTL = 0 100% (15) Pj × 0 Um where Uk is the topographical landscape damaged area and Um is the open-pit mining area. The damages of topographical landscape are evaluated with the recognition results of IMRT and the remote sensing interpretation results. As shown in Figure 13, the total number of key mining areas in Shennongjia is two, representing the smallest number. As we can see, the DROTL level of all the key mining areas is III, showing that the damage degree of this area is relatively slight. The maximum number of key mining area is 24 in Yichang, and the DROTL levels are mostly III, with only 1 for level I. This indicates that the mines in this region are in reasonable exploitation. The topographical landscapes in Huangshi, Xianning and Jingmen are seriously damaged, and with 75.0%, 68.8% and 53.8% of the topographical landscape destruction in their regions, respectively, they are represented by level I. A total of 36.2% of topographical landscape damage to the key mining area reached level I in Hubei Province, which shows that mineral exploitation has caused serious damage to the topographical landscape. In the remote sensing interpretation of mines’ geological environments, the level of land occupation and destruction (Table3) is mainly evaluated through the degree of occupation and destruction of farmland, forest land (or grassland) and unused land. As shown in Table4, in the evaluation, there are 45 key mining areas in level I, 37 key mining areas in level II, 44 key mining areas in level III, and 4 key mining areas without mining land occupation or destruction. Level I of land occupation and destruction represents a total of 34.62%, which is close to the amount of key mining areas facing topographical landscape damage.

Table 3. The scale of land occupation and destruction evaluation.

III II I Farmland 10 >10 ≤ Forest and Grassland 4 4–10 >10 ≤ Unused land 40 40–100 >100 ≤ Unit: hm2.

Table 4. The level of land occupation and destruction of cities in Hubei.

None III II I Number 4 44 37 45 Number ratio (%) 3.08 33.84 28.46 34.62

Large-scale and high-intensity development of mines directly lead to geological environment problems such as destruction of water resource balance and reduction of vegetation coverage and land occupation. It is necessary for departments to exercise their functions, regulate the mineral resource exploitation activities and ensure the orderly exploitation of mineral resources. Through mine Remote Sens. 2020, 12, x FOR PEER REVIEW 16 of 19 land occupation and destruction represents a total of 34.62%, which is close to the amount of key mining areas facing topographical landscape damage. Large-scale and high-intensity development of mines directly lead to geological environment problemsRemote Sens. such2020, 12 as, 3474 destruction of water resource balance and reduction of vegetation coverage17 and of 20 land occupation. It is necessary for departments to exercise their functions, regulate the mineral resource exploitation activities and ensure the orderly exploitation of mineral resources. Through monitoring, it is of great practical significance to provide accurate and reliable data for departments mine monitoring, it is of great practical significance to provide accurate and reliable data for and realize the coordinated development of mines and the ecological environment. departments and realize the coordinated development of mines and the ecological environment.

30

25

20

15

10

5

0 Ⅲ Ⅱ Ⅰ SUM Ezhou 2 1 1 4 Enshi 7 2 1 10 Huanggang 4 0 8 12 Huangshi 4 5 5 14 Jingmen 5 1 7 13 Shennongjia 2 0 0 2 Shiyan 5 2 5 12 Suizhou 2 0 1 3 Wuhan 0 0 4 4 Xianning 2 3 11 16 7 0 1 8 5 1 2 8 Yichang 21 2 1 24 Figure 13. The damages rate of the topographical landscape of cities in Hubei. Figure 13. The damages rate of the topographical landscape of cities in Hubei.

6. Conclusions Table 3. The scale of land occupation and destruction evaluation. Deep learning provides an approach to learn effectiveⅢ featuresⅡ automaticallyⅠ from a training set, and it can perform unsupervised characteristicFarmland learning from≤10 an enormous>10 original image dataset. In view of the low extraction accuracy,Forest and low Grassland efficiency and≤4 low4 degree–10 of>10 automation of remote sensing images of mines by traditional methods,Unused land we proposed ≤40 an 40 open-pit–100 > mine100 extraction model bases on Improved Mask R-CNN and Transfer learningUnit: (IMRT), hm2. constructed a set of multi-source mine sample databases consisting of Gaofen-1, Gaofen-2 and Google Earth satellite images with a resolution of two meters, and designedTable an4. The automatic level of land batch occupation production and processdestruction of open-pitof cities in mineHubei. targets in order to automatically identify and dynamically monitorNone open-pit Ⅲ mines. Ⅱ The mainⅠ conclusions are as follows: (1) The experiment results showNumber that the IMRT model4 is44 superior 37 to traditional45 methods in precision, generalization, automationNumber and ratio effi ciency.(%) 3.08 At the same33.84 time,28.46 this 34.62 model has a good applicability for Gaofen-1, Gaofen-2 and Google Earth satellite images, which expands the data source of open-pit mines and enhances the practicability of model.

(2) Remote sensing images are used to identify the open-pit mines in Hubei Province from 2017 to 2019 automatically. By analyzing the target recognition results of IMRT, it is shown that although Remote Sens. 2020, 12, 3474 18 of 20

the number of open-pit mines is slowly decreasing, some of the key mining areas are increasing. Part of the open-pit mines is constantly expanding. (3) Level I (serious) of land occupation and destruction accounts for 34.62%, and 36.2% of topographical landscape damage reached level I, which shows that the mineral exploitation has caused serious damage to the topographical landscape. It is necessary for the departments to regulate mineral resource exploitation activities and ensure the orderly exploitation of mineral resources.

Author Contributions: Conceptualization, C.W.; methodology, C.W. and L.C.; software, C.W. and L.C.; validation, C.W.; formal analysis, C.W., L.C. and L.Z.; resources, R.N.; writing—original draft, C.W.; writing—review and editing, C.W. and L.Z. All authors have read and agreed to the published version of the manuscript. Funding: This research received no external funding. Acknowledgments: This work is funded by Hubei Geological Environment Station. The vector data of 130 key mines in Hubei Province as well as Gaofen-1 and Gaofen-2 satellite images are mainly provided by the Hubei Data and Application Center. Conflicts of Interest: The authors declare no conflict of interest.

References

1. Pohl, C.; van Genderen, J. Multisensor image fusion in remote sensing: Concepts, methods and applications. Int. J. Remote Sens. 1998, 19, 823–854. [CrossRef] 2. Ji, S.; Tian, S.; Zhang, C. Urban Land Cover Classification and Change Detection Using Fully Atrous Convolutional Neural Network. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 233–241. 3. Zainuddin, M.; Kiyofuji, H.; Saitoh, K.; Saitoh, S. Using multi-sensor satellite remote sensing and catch data to detect ocean hot spots for albacore (Thunnus alalunga) in the northwestern North Pacific. Deep-Sea Res. Part II 2006, 53, 419–431. [CrossRef] 4. Xiao, W.; Zhang, W.; Lyu, X.; Wang, X. Spatio-temporal patterns of ecological capital under different mining intensities in an ecologically fragile mining area in Western China: A case study of Shenfu mining area. J. Nat. Res. 2020, 35, 68–81. 5. Wang, L.; Jin, X.; Jia, H.; Tang, Y.; Ma, G. Change detection for mine environment based on domestic high resolution satellite images. Remote Sens. Land Resour. 2018, 30, 151–158. 6. Nguyen, A.; Liou, Y.; Li, M.; Tran, T. Zoning eco-environmental vulnerability for environmental management and protection. Ecol. Indic. 2016, 69, 100–117. [CrossRef] 7. Asif, Z.; Chen, Z. An integrated optimization and simulation approach for air pollution control under uncertainty in open-pit metal mine. Front. Environ. Sci. Eng. 2019, 13, 2095–2201. [CrossRef] 8. He, F.; Xu, Y.; Chen, H.; Zhang, J. Present status of mine geohazards in the northwest region of China and characteristics of their temporal-spatial distribution. Geol. Bull. China 2008, 27, 1245–1255. 9. Gong, P.; Liu, H.; Zhang, M.; Li, C.; Wang, J. Stable classification with limited sample: Transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017. Sci. Bull. 2019, 64, 370–373. [CrossRef] 10. Diao, M.; Liu, F.; Tan, Z.; Xue, T.; Wang, Y. Research and implement on automatic production method of mine remote sensing monitoring interpretation record table. Remote Sens. Land Resour. 2018, 30, 212–217. 11. Alborzi, N.; Poorahangaryan, F.; Beheshti, H. Spectral-spatial Classification of Hyperspectral Images Using Signal Subspace Identification and Edge-preserving Filter. Int. J. Autom. Comput. 2020, 17, 222–232. [CrossRef] 12. Huang, X.; Zhang, L.; Li, P. Classification of High Spatial Resolution Remotely Sensed Imagery Based Upon Fusion of Multiscale Features and SVM. J. Remote Sens. 2007, 11, 48–54. 13. Hou, W.; Lu, X.; Zhang, C.; Wang, J. Object-oriented Information Extraction from High Resolution Imagery: A Case Study for Recognition of Residential Area in Lixian County, Sichuan Province. J. Geo-Inf. Sci. 2010, 12, 119–125. 14. Kang, J.; Zhou, L.; Zhao, D.; Wen, X. High Resolution Remote Sensing Image Object Recognition Method Based on Histogram Feature Knowledge Base. Chin. Rare Earth 2020, 41, 32–40. Remote Sens. 2020, 12, 3474 19 of 20

15. Sampath, A.K.; Gomathi, N. Decision tree and deep learning based probabilistic model for character recognition. J. Cent. South Univ. 2017, 24, 2862–2876. [CrossRef] 16. Yu, K.; Jia, L.; Chen, Y.; Xu, W. Deep Learning: Yesterday, Today, and Tomorrow. J. Comput. Resour. Dev. 2013, 50, 1799–1804. 17. Dang, Y.; Zhang, J.; Deng, K.; Zhao, Y.; Yu, F. Study on the Evaluation of Land Cover Classification using Remote Sensing Images Based on AlexNet. J. Geo-Inf. Sci. 2017, 19, 1530–1537. 18. Wang, L.; Li, J.; Zhou, G.; Yang, D. Application of Deep Transfer Learning in Hyperspectral Image Classification. Comput. Eng. A 2019, 55, 181–186. 19. Clabaut, É.; Lemelin, M.; Germain, M.; Williamson, M.; Brassard, É. A Deep Learning Approach to the Detection of Gossans in the Canadian Arctic. Remote Sens. 2020, 12, 3123. [CrossRef] 20. Cai, H.; Xu, Y.; Li, Z.; Cao, H.; Feng, Y.; Chen, S.; Li, Y. The division of metallogenic prospective areas based on convolutional neural network model: A case study of the Daqiao gold polymetallic deposit. Geol. Bull. China 2019, 38, 1999–2009. 21. Yang, H.; Zhao, Y.; Dong, J. Remote sensing image super-resolution of open-pit mining area based on texture transfer. J. China Coal Soc. 2019, 44, 3781–3789. 22. Xiang, Y.; Zhao, Y.; Dong, J. Remote sensing image mining area change detection based on improved UNet siamese network. J. China Coal Soc. 2019, 44, 3773–3780. 23. Wu, Z.; Lei, S.; Lu, Q.; Bian, Z.; Ge, S. Spatial distribution of the impact of surface mining on the landscape ecological health of semi-arid grasslands. Ecol. Indic. 2020, 111, 105996. [CrossRef] 24. Slavomir, L.; Marcela, B.; Zofia, K.; Stefan, K.; Gabriel, F.; Vieroslav, M. Utilization of Geodetic Methods Results in Small Open-Pit Mine Conditions: A Case Study from Slovakia. Minerals 2020, 10, 489–510. 25. Gong, C.; Lei, S.; Bian, Z.; Liu, Y.; Zhang, Z.; Cheng, W. Analysis of the Development of an Erosion Gully in an Open-Pit Coal Mine Dump During a Winter Freeze-Thaw Cycle by Using Low-Cost UAVs. Remote Sens. 2019, 11, 1365. [CrossRef] 26. Zhang, Y.; Shen, W.; Li, M.; Lv, Y. Integrating Landsat Time Series Observations and Corona Images to Characterize Forest Change Patterns in a Mining Region of Nanjing, Eastern China from 1967 to 2019. Remote Sens. 2020, 12, 3191. [CrossRef] 27. Wen, Y.; Li, L.; Shang, X.; Hu, F. An Instance Segmentation Method based on Improved MASK RCNN Feature Fusion. Comput. A Softw. 2019, 36, 130–133. 28. Lu, H.; Fu, X.; He, Y.; Li, L.; Zhuang, W.; Liu, T. Cultivated Land Information Extraction from High Resolution UAV Images Based on Transfer Learning. Trans. Chin. Soc. Agric. Mach. 2015, 46, 274–279, 284. 29. Wang, S.; Zhang, Z.; Zhao, X.; Zhou, Q. Eco-environmental synthetic analysis based on RS and GIS technology in Hubei province. Adv. Earth Sci. 2002, 17, 426–431. 30. Zhou, T.; Fan, Y.; Yuan, F. Advances on petrogensis and metallogeny study of the mineralization belt of the Middle and Lower Reaches of the Yangtze River area. Acta. Petrol. Sin. 2008, 24, 1665–1678. 31. Zhang, W.; Witharana, C.; Liljedahl, A.K.; Kanevskiy,M. Deep Convolutional Neural Networks for Automated Characterization of Arctic Ice-Wedge Polygons in Very High Spatial Resolution Aerial Imagery. Remote Sens. 2018, 10, 1487. [CrossRef] 32. Kalfarisi, R.; Wu, Z.; Soh, K. Crack Detection and Segmentation Using Deep Learning with 3D Reality Mesh Model for Quantitative Assessment and Integrated Visualization. J. Comput. Civil. Eng. 2020, 34, 04020010. [CrossRef] 33. Machefer, M.; Lemarchand, F.; Bonnefond, V.; Hitchins, A.; Sidiropoulos, P. Mask R-CNN Refitting Strategy for Plant Counting and Sizing in UAV Imagery. Remote Sens. 2020, 12, 3015. [CrossRef] 34. Schaap, M.; Leij, F.; van Genuchten, M. Neural network analysis for hierarchical prediction of soil hydraulic properties. Soil Sci. Soc. Am. J. 1998, 62, 847–855. [CrossRef] 35. Panboonyuen, T.; Jitkajornwanich, K.; Lawawirojwong, S.; Srestasathiern, P.; Vateekul, P. Semantic Labeling in Remote Sensing Corpora Using Feature Fusion-Based Enhanced Global Convolutional Network with High-Resolution Representations and Depthwise Atrous Convolution. Remote Sens. 2020, 11, 1233. [CrossRef] 36. Sizkouhi, A.; Aghaei, M.; Esmailifar, S.; Mohammadi, M.; Grimaccia, F. Automatic Boundary Extraction of Large-Scale Photovoltaic Plants Using a Fully Convolutional Network on Aerial Imagery. IEEE J. Photovolt. 2020, 10, 1061–1067. [CrossRef] 37. Almubarak, H.; Bazi, Y.; Alajlan, N. Two-Stage Mask-RCNN Approach for Detecting and Segmenting the Optic Nerve Head, Optic Disc, and Optic Cup in Fundus Images. Appl. Sci. 2020, 10, 3833. [CrossRef] Remote Sens. 2020, 12, 3474 20 of 20

38. Li, D.; He, W.; Guo, B.; Li, M.; Chen, M. Building target detection algorithm based on Mask-RCNN. Sci. Surv. Mapp. 2019, 44, 172–180. 39. Shi, J.; Zhou, Y.; Zhang, Q. Service robot item recognition system based on improved Mask RCNN and Kinect. Chin. J. Sci. Instrum. 2019, 40, 216–228. 40. Zhuang, F.; Luo, P.; He, Q.; Shi, Z. Survey on Transfer Learning Research. J. Softw. 2015, 26, 26–39. 41. Hu, F.; Xia, G.; Hu, J.; Zhang, L. Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery. Remote Sens. 2015, 7, 14680–14707. [CrossRef] 42. Zhang, B.; Shi, Z.; Zhao, X.; Zhang, J. A Transfer Learning Based on Canonical Correlation Analysis Across Different Domains. Chin. J. Comput. 2015, 38, 1326–1336. 43. Zlateski, A.; Jaroensri, R.; Sharma, P.; Durand, F. On the Importance of Label Quality for Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1479–1487. 44. Wu, T.; Luo, J.; Xia, L.; Yang, H.; Shen, Z.; Hu, X. An Automatic Sample Collection Method for Object-oriented Classification of Remotely Sensed Imageries Based on Transfer Learning. Acta Geol. Cartogr. Sin. 2014, 43, 908–916. 45. Li, C.; Lu, L.; Bi, M.; Wang, M.; Zhao, G. Sythetic information ore-prospecting model for gold minerals of Baishan area, Jilin. World Geol. 2010, 29, 71–77. 46. Vinyals, O.; Toshev, A.; Bengio, S.; Erhan, D. Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge. IEEE Trans. Pattern Anal. 2017, 39, 652–663. [CrossRef] 47. Xu, C.; Wang, X.; Yang, Y.Attention-YOLO: YOLO Detection Algorithm That Introduces Attention Mechanism. Comput. Eng. A. 2019, 55, 13–23, 125. 48. Lv, Y.; Zhang, C.; Yun, W.; Gao, L.; Wang, H.; Ma, J.; Li, H.; Zhu, D. The Delineation and Grading of Actual Crop Production Units in Modern Smallholder Areas Using RS Data and Mask R-CNN. Remote Sens. 2020, 12, 1074. [CrossRef] 49. Song, X.; Jiao, L. Classification of Hyperspectral Remote Sensing Image Based on Sparse Representation and Spectral Information. J. Electron. Inf. Technol. 2012, 34, 268–272. 50. Li, Q.; Liu, X.; Li, M.; Niu, J.; Li, R. A target detection algorithm of HFSWR based on redundant discrete wavelet transform. Chin. J. Radio Sci. 2016, 31, 501–507. 51. Duan, R.; Zhao, W.; Huang, S.; Chen, J. Fast line detection algorithm based on improved Hough transformation. Chin. J. Sci. Instrum. 2010, 31, 2774–2780. 52. Lin, Y.; Zhang, B.; Xu, J.; Hou, K.; Zhou, X. Building Extraction from High Resolution Remote Sensing Imagery with Multi-feature and Multi-scale. Bull. Surv. Mapp. 2017, 12, 53–57.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).