

Article

A Multi-Convolutional Autoencoder Approach to Multivariate Geochemical Anomaly Recognition

Lirong Chen 1,2, Qingfeng Guan 1,2,*, Bin Feng 3, Hanqiu Yue 1,2, Junyi Wang 1,2 and Fan Zhang 4

1 School of Geography and Information, China University of Geosciences (Wuhan), Wuhan 430074, China; [email protected] (L.C.); [email protected] (H.Y.); [email protected] (J.W.)
2 National Engineering Research Center of GIS, China University of Geosciences (Wuhan), Wuhan 430074, China
3 Institute of Geophysical and Geochemical Exploration, China Academy of Geological Science, Langfang 065000, China; [email protected]
4 Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China; [email protected]
* Correspondence: [email protected]

 Received: 8 February 2019; Accepted: 27 April 2019; Published: 30 April 2019 

Abstract: The spatial structural patterns of geochemical backgrounds are often ignored in geochemical anomaly recognition, leading to the ineffective recognition of valuable anomalies in geochemical prospecting. In this contribution, a multi-convolutional autoencoder (MCAE) approach is proposed to deal with this issue, which includes three unique steps: (1) a whitening process is used to minimize the correlations among geochemical elements, avoiding the dilution of effective background information embedded in redundant data; (2) the Global Moran’s I index is used to determine the recognition domain of the background spatial structure for each element, and then the domain is used for convolution window size setting in MCAE; and (3) a multi-convolutional autoencoder framework is designed to learn the spatial structural pattern and reconstruct the geochemical background of each element. Finally, the anomaly score at each sampling location is calculated as the difference between the whitened geochemical features and the reconstructed features. This method was applied to the southwestern Fujian Province metallogenic belt in China, using the concentrations of Cu, Mn, Pb, Zn, and Fe2O3 measured from stream sediment samples. The results showed that the recognition domain determination greatly improved the quality of anomaly recognition, and MCAE outperformed several existing methods in all aspects. In particular, the anomalies from MCAE were the most consistent with the known Fe deposits in the area, achieving an area under the curve (AUC) of 0.89 and a forecast area of 17%.

Keywords: multivariate geochemical anomalies; convolutional autoencoder; spatial structure; global Moran’s I

1. Introduction

Geochemical anomaly recognition is a key task in mineral exploration [1,2] for the alleviation of the current shortages of mineral resources [3]. A large number of geochemical anomaly recognition methods have been developed in the past few decades [4–6]. Geochemical anomaly recognition aims to discover the prospecting information contained in geochemical data by identifying the anomalies that deviate from the normal geochemical samples (i.e., the geochemical background) [3]. Full exploitation of geochemical information can improve the performance of anomaly recognition. Regional geochemical

samples demonstrate clear signs of spatial dependence, meaning that they are not independent of each other [7,8]. The spatial structure of a geochemical element is influenced by the combination of a variety of geological processes, including diagenesis and mineralization [9,10]. Therefore, the analysis of spatial structure information derived from geochemical data can facilitate the study of regional metallogenic mechanisms and the recognition of geochemical anomalies for ore prospecting. Geochemical anomaly recognition methods that consider the spatial distribution of geochemical samples include spatial analysis methods (e.g., the singularity technique and spatial fractal analysis) [8,11–13] and geostatistical methods (e.g., geographically weighted principal components analysis, Kriging, multivariate regression, and inverse distance-weighted regression) [14–16]. These methods facilitate the detection and understanding of geochemical spatial patterns and local spatial structures.

In 2006, the deep learning (DL) model proposed by Hinton and colleagues inspired a surge in artificial neural network (ANN) application research [17]. A variety of ANNs, such as the deep belief network (DBN) and the convolutional neural network (CNN), have been successfully applied in various fields, such as speech recognition [18], image recognition [19], and information retrieval [20]. In the field of geochemical exploration, DL models have also been applied for geochemical anomaly recognition [3,21,22]. For example, DBN has been used to extract the characteristics of the relationships between the elements related to Fe ore, and to reconstruct the backgrounds to separate anomalies from the background [23]. Chen et al. used the continuous restricted Boltzmann machine (CRBM) model to learn the relationship between known deposits and an evidence map [24]. In short, applications of ANN/DL models in geochemical prospecting have only focused on the relationships among the geochemical elements and/or the correlations between known deposits and geochemical elements. The spatial distributions of geochemical elements have rarely been taken into account in the existing ANN/DL-based models.

CNN, a typical DL model, is capable of extracting the patterns of spatial arrangement of features [25–27]. Therefore, CNN has become a widely used technique for image recognition tasks such as handwriting and facial recognition [28,29]. A classical CNN combines multiple convolution layers and pooling layers to create a map between the input features and output targets [30]. A convolution layer extracts local spatial patterns, and a pooling layer reduces the amount of data while retaining useful information. CNN has been used for outlier detection of spatial data [31,32]. A convolutional autoencoder (CAE) is an autoencoder (AE) neural network that uses convolution layers and pooling layers to extract the hidden patterns of input features (i.e., encoding), and deconvolution layers and unpooling layers to reconstruct the features from the hidden patterns (i.e., decoding). By integrating convolutional and deconvolutional layers in an AE structure, CAE is capable of learning the spatial structure of input features, and reconstructing these features while taking into account their spatial structural patterns [33–35]. Chen et al. used a CAE model to achieve the unsupervised extraction of spatial structural patterns of color images [36]. Baccouche et al.
achieved local two-dimensional (2D) feature acquisition using CAE for video sequence classification [37]. Chen et al. measured the similarity of images using CAE for medical image analysis [38]. Given that the geochemical background is represented by the majority of geochemical samples and the much fewer anomalies pose little impact on the general spatial pattern of geochemistry, CAE provides a promising approach to reconstruct the geochemical background by learning the spatial structural pattern of samples, and can thus identify anomalous samples by their relatively larger differences from the reconstructed values.

In regional geochemical explorations, the concentrations of multiple elements are often measured at each sampling location. Multivariate analysis usually provides more robust and indicative results by identifying the combined anomalies of multiple elements, because the source or process that has generated the anomalies commonly has an association with a suite of elements (i.e., target and pathfinder elements) [39]. When applying CAE to the anomaly recognition of multiple geochemical elements in a region, two issues must be dealt with: (1) the geochemical backgrounds of different elements can be quite different [1,40]; therefore, it is necessary to reconstruct the backgrounds of elements separately in order to avoid mutual interference. Furthermore, multiple elements at a certain location are inevitably correlated [41], leading to redundant information contained in the raw multivariate geochemical data. Such redundancy must be minimized through a pre-processing procedure such that the background of each pre-processed element can be effectively learned and reconstructed. (2) For a geochemical element, the spatial autocorrelation exhibits various patterns within different distance ranges. Within a small range, the concentration values are often highly similar with each other, exhibiting a clustered spatial pattern. As the range increases, the degree of clustering decreases and the spatial distribution changes towards a random pattern. Therefore, the size of the convolution window has a great impact on the effectiveness of spatial pattern learning [28], and thus on the performance of anomaly recognition. Currently, there is no unified method, and the window size is usually set based on experience [42]. A method is needed for determining the proper convolution window size based on the quantitative analysis of the spatial autocorrelation ranges of geochemical elements.

In this study, a multi-convolutional autoencoder (MCAE) approach to multivariate geochemical anomaly recognition is proposed.
Specifically, a whitening process based on the zero-phase component analysis (ZCA) is used to minimize the correlations among geochemical elements. In a multi-CAE model, each CAE is trained using the spatial distribution map of a corresponding whitened element, and generates a reconstructed map that represents the geochemical background of that element. For the convolution window size, the Global Moran’s I, a measurement of spatial correlation that has been used to explore the spatial correlation range of mineralization elements [43], is used to determine the size of the recognition domain (i.e., convolution window) of each whitened element. Finally, the anomaly score at each sampling location is calculated as the difference (measured by Euclidean distance) between the original values of the whitened elements and the reconstructed values, and an anomaly map of the region is produced.

The MCAE was applied to the stream sediment survey data of southwestern Fujian Province, which is one of the most important iron polymetallic metallogenic regions in China. The results showed that MCAE outperformed several existing methods in all aspects. Additionally, setting the convolution window size using the Global Moran’s I effectively helped improve the anomaly recognition accuracy in the experiments.

2. Multi-Convolutional Autoencoder (MCAE) Approach

As shown in Figure 1, the proposed MCAE includes the following steps. Firstly, a whitening algorithm is used to reduce the correlations among multiple geochemical elements. Secondly, the Global Moran’s I values of each element with different spatial ranges (also termed bandwidths) are calculated, and the convolution window size for an element is determined based on the Moran’s I values. Thirdly, multiple CAEs are used to learn and reconstruct the backgrounds of multiple elements separately. Finally, the anomaly score at each sampling location is calculated as the multivariate difference between whitened values and reconstructed values, and a geochemical anomaly map is generated.

Figure 1. Flow chart of the multi-convolutional autoencoder (MCAE).

2.1. Decorrelation of Geochemical Elements

In the MCAE, the geochemical background of each element is learned and reconstructed by a CAE separately. The correlations among elements need to be minimized to avoid mutual interference. The decorrelation of multiple variables is also termed whitening, and the common whitening methods include principal component analysis (PCA) and zero-phase component analysis (ZCA) [44,45]. The basic principle of ZCA whitening is to rotate the results of PCA whitening so that the data dimension after whitening is closer to the original data. ZCA whitening is characterized by a whitened variable that remains highly correlated with the original variable, maintaining its original interpretation [46]. PCA whitening has better compression capability, and the whitened variable has no interpretable relation to the original data [47]. To avoid the loss of the original meaning of the variables, ZCA whitening is used in MCAE.

In ZCA, n samples with m variables are formed into an n × m matrix Ψ. Through the eigenvalue decomposition of the covariance matrix C = cov(Ψ), the eigenvalues and the eigenvectors are obtained, and then the input data are scaled with the eigenvalue factor:

\Psi_{PCAwhite} = \frac{(\Psi - \mathrm{mean}(\Psi)) \cdot V}{\sqrt{\mathrm{diag}(D) + \xi}} \quad (1)

where ξ is the whitening factor, mean(Ψ) is the matrix of column means, D is a diagonal matrix containing the eigenvalues, and V is a matrix whose columns correspond to the appropriate eigenvectors. To avoid numerical instability or data overflow caused by the eigenvalues diag(D) approaching 0, ξ can be set to a small positive number. Based on this, Equation (2) is used for further processing of the PCA whitening results to obtain element values close to the original data and with low correlation:

\Psi_{ZCAwhite} = \Psi_{PCAwhite} \cdot V^{T} \quad (2)
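As a concrete illustration (not the authors' code), a minimal NumPy sketch of the whitening in Equations (1) and (2); the function name and the default value of ξ are assumptions.

```python
import numpy as np

def zca_whiten(psi, xi=1e-5):
    """ZCA whitening of an n x m data matrix, following Equations (1) and (2).
    xi is the whitening factor; 1e-5 is an assumed small positive default."""
    centered = psi - psi.mean(axis=0)                # Psi - mean(Psi)
    cov = np.cov(centered, rowvar=False)             # C = cov(Psi)
    eigvals, eigvecs = np.linalg.eigh(cov)           # diag(D) and V
    pca_white = (centered @ eigvecs) / np.sqrt(eigvals + xi)   # Equation (1)
    zca_white = pca_white @ eigvecs.T                           # Equation (2)
    return zca_white
```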

2.2. Determination of the Recognition Domain of the Background Spatial Structure

The concentrations of geochemical elements are the results of the superposition of multiple geological processes and often exhibit certain forms of spatial structures, mainly influenced by the spatial autocorrelation of elements [48,49]. Within a certain distance, the change of element concentration in space shows a degree of regularity, and the spatial structure can provide key information for the recognition of geochemical background and anomalies. Beyond this distance, randomness becomes dominant in the spatial distribution of element concentrations, where weak regularity or no regularity is found [50]. In geochemical data analysis, methods based on spatial autocorrelation theory, such as the Global Moran’s I, can reveal the spatial structure of regional geochemical variables [51–53]. Generally, the Global Moran’s I is between −1.0 and 1.0. A positive correlation among spatial samples is found when the Moran’s I is close to 1, while a negative correlation is found when I is close to −1. Both indicate regular spatial changes of observed values, that is, a spatial structure. A value of 0 or close to 0 means that the observed values are randomly distributed in space and no spatial structure is found [54,55].

Both the background spatial structure and the anomaly spatial structure can be found for a certain element. They have different recognition domains (Figure 2). Generally, background recognition domains are larger than anomaly recognition domains. In the background recognition domains, the spatial structure is stable, meaning the structure does not change greatly with the increase of the recognition domain. For example, as shown in Figure 2, as the domain increases from 45 km × 45 km to 50 km × 50 km, the spatial structure still shows a linear pattern. On the other hand, the anomaly shows the opposite behavior, and its spatial structure changes greatly with the increase of the recognition domain.

In MCAE, the Global Moran’s I is used to measure the degrees of spatial autocorrelation among samples within different distance bands, and eventually to determine the recognition domain of a whitened element. If the distance between two samples is less than a distance band, these two samples are considered to be adjacent in the Global Moran’s I. The specific process is as follows: First, the Global Moran’s I values with different distance bands are calculated for a certain whitened element, and a relationship curve between the I values and the distance bands is generated. The I value usually decreases as the distance band increases, since concentrations of an element are often highly similar within a short distance and the similarity reduces as the distance becomes larger. After the dropping in the early stages, the relationship curve gradually becomes stable as the distance band increases.

Second, the distance bands corresponding to the stable section of the curve with I values greater than 0 are selected. As mentioned before, a smaller absolute value of Moran’s I corresponds to stronger randomness. To capture the background spatial structure of the element and avoid the interference of randomness, the distance band with the maximum absolute value of Moran’s I is selected from the previously selected distance bands as the background recognition domain.
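A minimal sketch of this selection rule, assuming the Global Moran's I values have already been computed for a set of distance bands; the stability tolerance used to delimit the "stable section" is an illustrative assumption, since the text does not fix a numerical threshold.

```python
import numpy as np

def select_recognition_domain(bands, moran_i, stable_tol=0.01):
    """Pick the background recognition domain from an I-vs-distance-band curve.
    bands and moran_i are 1-D arrays of equal length; stable_tol (assumed) bounds
    the change of I between successive bands that defines the stable section."""
    diffs = np.abs(np.diff(moran_i))
    # a band belongs to the stable section if I changes little after it
    stable = np.concatenate([diffs <= stable_tol, [True]])
    candidates = stable & (moran_i > 0)          # stable section with I > 0
    if not candidates.any():
        return None
    # among the candidates, take the band with the largest |I|
    idx = np.argmax(np.where(candidates, np.abs(moran_i), -np.inf))
    return bands[idx]
```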

Figure 2. Spatial structures under different recognition domains.

The Moran’s I is defined as:

I = \frac{n \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}(x_i - x_m)(x_j - x_m)}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij} \cdot \sum_{i=1}^{n}(x_i - x_m)^2} \quad (3)

where x_i, x_j are the whitened values at sampling locations i and j; x_m is the mean value of the whitened values at all locations; w_{ij} are the weights representing the proximity relationships between locations i and j, and are often a function of distance; and n is the total number of samples.
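For illustration, a small NumPy sketch of Equation (3) using the binary adjacency weights described above (two samples are considered neighbours when their separation is within the distance band); the function and argument names are assumptions.

```python
import numpy as np

def global_morans_i(values, coords, distance_band):
    """Global Moran's I (Equation (3)) for one whitened element.
    values: 1-D array of whitened concentrations; coords: (n, 2) array of sample
    coordinates; distance_band: samples closer than this are treated as adjacent."""
    x = np.asarray(values, dtype=float)
    n = x.size
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    w = ((d > 0) & (d <= distance_band)).astype(float)   # w_ij, zero on the diagonal
    dev = x - x.mean()                                    # x_i - x_m
    num = n * np.sum(w * np.outer(dev, dev))
    den = w.sum() * np.sum(dev ** 2)
    return num / den
```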
2.3. Multi-CAE

A convolutional autoencoder (CAE) integrates the merits of a convolutional neural network (CNN) and an autoencoder neural network (AE) [37,56]. The convolutional layer in CNN is used for sharing the local weights, and AE is employed for capturing the high probability information [56], i.e., the geochemical backgrounds. Based on CAE, the multi-convolutional autoencoder (MCAE) is the model for multiple geochemical elements. A CAE corresponding to an element extracts the background spatial structure features and reconstructs the background with these features (as shown in Figure 3). MCAE has the following characteristics: (1) the convolution windows (or filters and kernels) can be used to identify the structural features of the spatial distribution; (2) unsupervised training can avoid the omission of valid information caused by supervised learning; (3) the CAEs are mutually independent, avoiding the mutual interference of the spatial characteristics of different elements; and (4) the parallel handling of multiple geochemical elements considerably shortens the training time.


Figure 3. Structure of the multi-convolutional autoencoder (MCAE).

MCAE includes two stages: CAEs and anomaly score calculation. The CAEs complete the extraction of the spatial structure features and the background reconstruction, and the anomaly score calculation is used to calculate the difference between the input and the reconstruction.

2.3.1. Convolutional Autoencoders (CAEs)

A CAE is composed of one or more Encoder–Decoder groups, as shown in Figure 4. The composition of multiple groups is similar to the embedded one; thus, the extracted spatial structure is cut by the newly added encoder to meet certain application requirements. An encoder includes a convolutional layer and a pooling layer, and the corresponding decoder includes an unpooling layer and a deconvolution layer (as shown in Figure 5).

Figure 4. Convolutional autoencoder (CAE) with one or more Encoder–Decoder groups.

• Spatial information encoder part

The purpose of the encoder component is to capture the spatial structure features of geochemical backgrounds. The encoder involves the convolutional layer and the pooling layer.


In the convolutional layer, convolutional windows carrying the weight parameters slide over an element map from the upper-left corner to the lower-right corner to detect the local spatial structure. Then, the feature maps are produced by convolutional calculations, as shown in Figure 5.

In the pooling layer, the output of the convolutional layer is used as the input of pooling. The purpose of pooling is to abandon the local spatial details and preserve the most representative local spatial information. This can prevent the model from overfitting. Pooling is carried out using two methods, namely, mean-pooling and max-pooling. Max-pooling is currently the most widely used method. When dealing with regional geochemical data, pooling that captures the maximum helps the model avoid learning all of the neurons of the feature map, which prevents it from learning the anomalies. Similar to the convolution layer, the max-pooling calculation is carried out using a sliding filter that takes the maximum. In this process, the model records the maximum value and the position of the maximum value within the area covered by the filter. These data are used for the unpooling of the decoder part.

• Spatial information decoder component

The decoder component, involving the de-convolutional layer and the un-pooling layer, is used for reconstructing the input with the features from the encoder (Figure 5). Generally, the max-pooling operation is non-invertible in CNN. However, we can obtain an approximate inverse by using the maximum value and its position recorded in the pooling process. Thus, in the unpooling layer, we use the unpooling filter to decode the information from the pooling encoder. Specifically, in the unpooling process, according to the location information recorded during pooling, the maximum value is placed in the corresponding position, and the other positions are filled with 0 (Figure 5). The preservation of the structure of the stimulus is an advantage of this process [57].

In the de-convolutional layer, the transposed convolution is used to restore the data compressed by convolution (Figure 5). Previous research has proven that transposed convolution can generate a convolutional image with the original input resolution [57]. Here, similar to the image convolution, the de-convolution also employs a sliding filter to complete the matrix operation. De-convolution uses the transposed convolution to restore the input before convolution, which is a decoding process [35].

Figure 5. Principles of encoder and decoder.
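The following NumPy sketch illustrates the max-pooling/unpooling pair described above: the pooling step records the position of each maximum, and the unpooling step places the maximum back at that position and fills the other positions with zeros. It is a simplified illustration under these assumptions, not the authors' implementation.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max-pooling that also records the argmax positions for later unpooling."""
    h, w = feature_map.shape
    pooled = np.zeros((h // 2, w // 2))
    positions = np.zeros((h // 2, w // 2), dtype=int)
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            window = feature_map[i:i + 2, j:j + 2]
            pooled[i // 2, j // 2] = window.max()
            positions[i // 2, j // 2] = int(window.argmax())  # index 0..3 in the window
    return pooled, positions

def unpool_2x2(pooled, positions):
    """Unpooling: restore each maximum to its recorded position, zeros elsewhere."""
    h, w = pooled.shape
    out = np.zeros((h * 2, w * 2))
    for i in range(h):
        for j in range(w):
            di, dj = divmod(positions[i, j], 2)
            out[2 * i + di, 2 * j + dj] = pooled[i, j]
    return out
```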


• Objective function of the CAE neural network

Each CAE is separately trained, and the CAEs do not interfere with each other. Here, we take the mean square error (MSE) as the cost function of the CAE:

L = \frac{1}{wh} \sum_{i=1}^{w} \sum_{j=1}^{h} \left| D(i,j) - X(i,j) \right|^{2} \quad (4)

where w and h are the width and height of each sample, respectively; D is the data passing through the model at position (i, j); and X is the data input to the model at position (i, j). We use the stochastic gradient descent (SGD) method to train the model and randomly initialize the weight parameters of each layer.
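A one-line NumPy rendering of Equation (4), included only to make the indexing explicit; the function name is an assumption.

```python
import numpy as np

def reconstruction_loss(reconstruction, original):
    """Mean square error of Equation (4) between a reconstructed map D and input map X."""
    w, h = original.shape
    return np.sum((reconstruction - original) ** 2) / (w * h)
```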

2.3.2. Anomaly Score Calculation and Anomaly Map Generation

The Euclidean distance is used to measure the difference between two multi-feature samples. A larger distance corresponds to a more anomalous sample. The Euclidean distance can reflect the absolute difference of individual numerical characteristics, and therefore, it is commonly used for analyzing numerical differences in m-dimensional space, for example, as a measure of anomaly scores. Here, we number the elements from 1 to n. The anomaly score of each sample is calculated as follows:

d_i = \sqrt{\sum_{k=1}^{n} \left( x_{ik} - x'_{ik} \right)^{2}} \quad (5)

Here, x_{ik} denotes the k-th element attribute value of sample i and x'_{ik} represents the reconstruction of x_{ik}. Finally, the anomaly scores d_i are projected to the corresponding geographical locations to generate an anomaly map.
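A minimal sketch of Equation (5), assuming the whitened values and their reconstructions are stored as arrays with one row per sampling location and one column per element.

```python
import numpy as np

def anomaly_scores(whitened, reconstructed):
    """Euclidean distance of Equation (5) per sampling location.
    whitened, reconstructed: arrays of shape (n_locations, n_elements)."""
    return np.sqrt(np.sum((whitened - reconstructed) ** 2, axis=1))
```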

3. Experiment and Evaluation

3.1. Study Area and Data

To assess its performance for geochemical anomaly recognition, the proposed MCAE approach was applied to the Fe metallogenic belt located in the center of the southwest Fujian depression. This region is characterized by large areas of Paleozoic stratigraphy and intermediate-acidic magmatic rocks from the Yanshan period that provide a material base and metallogenic environment for the formation of iron, copper, and zinc polymetallic ore deposits [58,59]. Several skarn-type Fe deposits have been found in this area (Figure 6).

Figure 6. Geological map of the study area (data from China Geological Survey [60]).

To examine the performance of MCAE, the stream sediment geochemical dataset at a scale of 1:200,000, collected on 2 km × 2 km grids in a regional geochemical survey as part of the Chinese National Geochemical Mapping (CNGM) project [60], was used in the experiment. The dataset includes 39 content values for each sample. Previous studies have indicated that Cu, Mn, Pb, Zn, and Fe2O3 are most strongly related to the Fe deposits in the region [61,62], and therefore, their concentration values were used for the anomaly recognition experiment. All of the concentration values were transformed into the range of [0, 1] using min–max normalization, as shown in Figure 7. Figure 7 shows that the spatial distribution of the five elements presents a certain regularity. The distribution of geochemical elements in hypergenic media is the result of the superposition of many complex geological processes [63]. The spatial regularity of the elements is due to the combination of the changes of the main geological actions, while the secondary geological actions are manifested in the randomness of the elements [64].

Minerals 2019, 9, 270 10 of 23 Minerals 2019, 9, x FOR PEER REVIEW 10 of 23

Figure 7. Geographic distributions of Cu-Fe2O3-Mn-Pb-Zn.

3.2. Specific Implementation Process

3.2.1. ZCA Whitening

Pretreatment whitening makes each CAE learn the unique spatial structure information of an element, rather than the structure repeated between elements. We generated the correlation matrices of the samples before and after whitening using the Spearman method (see Tables 1 and 2 below). In the entire study area, the correlations between Cu and Pb, Mn and Zn, Zn and Cu, and Fe2O3 and Cu were moderate (0.3 < |r| < 0.7), while the other correlations were low. After whitening, the correlation coefficients of the elements were all below 0.35.

Table 1. Correlation matrix of not-whitened elements.

Element   Cu         Mn         Pb         Zn        Fe2O3
Cu        1
Mn        -0.01 **   1
Pb        0.65 **    0.01 **    1
Zn        -0.53 **   -0.54 **   -0.03 **   1
Fe2O3     -0.41 **   -0.21 **   -0.05 **   0.19 **   1

** Correlation is significant at the 0.01 level (2-tailed).

Table 2. Correlation matrix of whitened elements.

Element   Cu         Mn         Pb         Zn         Fe2O3
Cu        1
Mn        -0.21 **   1
Pb        -0.33 **   -0.28 **   1
Zn        -0.29 **   -0.26 **   -0.32 **   1
Fe2O3     -0.13 **   -0.12 **   -0.18 **   -0.16 **   1

** Correlation is significant at the 0.01 level (2-tailed).
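The before/after correlation check can be reproduced with SciPy's Spearman correlation; the sketch below assumes the raw and whitened concentrations are stored as n × 5 arrays and is not the authors' code.

```python
from scipy.stats import spearmanr

def correlation_check(raw, whitened):
    """Spearman correlation matrices before and after ZCA whitening.
    raw, whitened: (n_samples, 5) arrays of Cu, Mn, Pb, Zn, Fe2O3 values."""
    rho_raw, _ = spearmanr(raw)          # 5 x 5 matrix for the raw elements (Table 1)
    rho_white, _ = spearmanr(whitened)   # off-diagonal values should shrink (Table 2)
    return rho_raw, rho_white
```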

Minerals 2019, 9, 270 11 of 23

3.2.2. Global Moran’s I

Global Moran’s I values were calculated with Equation (3) using ESRI ArcGIS software. The function of distance was defined using inverse distance weighting (IDW) based on the Euclidean distance. Distance bands were used for distinguishing between “close” and “not so close” locations. The “not so close” elements were ignored when calculating the Moran’s I value. To determine the background recognition domains of Cu-Fe2O3-Mn-Pb-Zn in the study area, we calculated the global Moran’s I values in different distance bands.

As shown in Figure 8, the global Moran’s I values decreased as the distance band increased, and the five curves were similar. When the distance band was less than 48 km, the Moran’s I value was between 0.1 and 0.7, and the Moran’s I values differed significantly between different distance bands, indicating unstable spatial structures in these recognition domains. When the distance band was between 48 and 64 km, the Moran’s I value decreased slowly with the increasing distance bands, with smaller differences at different distance bands. This demonstrates that the spatial structures in these recognition domains were relatively stable. When the distance band was greater than 64 km, the Moran’s I value started to approach 0, the spatial structure was unclear, and the regular patterns of the background spatial changes were difficult to find. Consequently, the background recognition domain of each element was set to 48–64 km, in which case CAE could easily extract the background spatial structure. Finally, those recognition domains were set to 50 km.

Figure 8. Variation of the global Moran’s I index with different distance bands.

3.2.3. MCAE Structure and Training

The MCAE included five CAEs corresponding to five elements (Cu-Fe2O3-Mn-Pb-Zn). Each CAE was independently trained and included one Encoder–Decoder group. The convolution window, acting as the spatial structure feature extraction tool, was key for the entire modeling process. The hyper-parameter settings of the convolution window were highly important and could directly affect the accuracy of the background reconstruction. The hyper-parameters mainly included the window size and the number of windows. The background recognition domain obtained through the Global Moran’s I was used for setting the size. The number of windows is generally set based on experience. A greater number of convolution windows will lead to more details in the reconstruction. This means that the reconstruction may include much invalid information that will interfere with anomaly recognition. If the number of convolution windows is too small, the spatial structure feature extraction will be insufficient, also causing the background reconstruction to be inaccurate. To find a suitable value, we examined the variation in

the reconstruction error by using a variety of CAE models with a different number of convolution windows (Figure 9).

Figure 9. (A) Variation in the reconstruction error with the number of training epochs. (B) Variation in the total reconstruction error with different numbers of convolution windows.

Figure 9 shows the total errors as the sum of the five element reconstruction errors, calculated when the reconstruction error curve was trained to be stable (Figure 9A). A stable reconstruction error meant that the model was considered “nearly well-trained”. When a CAE had 1–15 convolution windows, the total reconstruction error decreased rapidly with an increasing number of convolution windows. When the number was between 16 and 35, the error fluctuated smoothly. This shows that in the investigated area, when the number of convolution windows was greater than 15, the CAEs were not underfit. To avoid excessive convolution windows resulting in overfitting and increased computational complexity, each CAE had 16 convolution windows.

The CAE network structure was based on Keras, which is a Python library for deep learning that can run on top of the TensorFlow framework.

The CAE network parameters are as follows:

The encoder:

A. Input: 78 × 88 geochemical element concentration map.
B. C1: the convolution filter was 25 × 25, the number of convolution kernels was 16, and the nonlinear function was the Rectified Linear Unit (ReLU).
C. P1: the max-pooling filter was 2 × 2.

The decoder:

A. Un-P1: the size of the un-pooling area was 2 × 2.
B. De-C1: the de-convolution filter was 25 × 25, the number of convolution kernels was 16, and the nonlinear function was ReLU.
C. Output: the geochemical background map had 78 × 88 grids of 2 km × 2 km resolution, and the nonlinear function was Sigmoid.

The background recognition domain of each element was 50 km × 50 km from the Moran’s I value (Section 3.2.2) and the input was a 2 km × 2 km grid. Therefore, the convolution window size was set to 25 × 25.
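A hedged Keras sketch of one such CAE, following the listed hyper-parameters. Keras has no native unpooling layer, so UpSampling2D stands in for Un-P1, and a single Conv2DTranspose collapses De-C1 and the output map into one layer; these substitutions, and the function names, are assumptions rather than the authors' exact implementation.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cae():
    """One Encoder-Decoder CAE per element, roughly matching the listed parameters."""
    inputs = keras.Input(shape=(78, 88, 1))                        # element concentration map
    x = layers.Conv2D(16, (25, 25), activation="relu",
                      padding="same")(inputs)                      # C1: 16 windows of 25 x 25
    x = layers.MaxPooling2D((2, 2), padding="same")(x)             # P1
    x = layers.UpSampling2D((2, 2))(x)                             # Un-P1 (stand-in for unpooling)
    outputs = layers.Conv2DTranspose(1, (25, 25), activation="sigmoid",
                                     padding="same")(x)            # De-C1 merged with the output map
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.SGD(), loss="mse")    # Equation (4), trained with SGD
    return model

# one independently trained CAE per whitened element
caes = {name: build_cae() for name in ["Cu", "Mn", "Pb", "Zn", "Fe2O3"]}
```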


The five CAEs had the same network structure but different initial weights (w, b). The weights were randomly drawn from a zero-mean, unit-variance Gaussian N(0, 1). After CAE training, the reconstruction of the five elements was completed.

3.2.4. Anomaly Map Generation

Using Equation (5), we obtained the difference between the input and reconstructed data of each sample, which was taken as the anomaly score of that sample. Then, these anomaly scores were projected to the corresponding geographical locations to generate an anomaly map (Figure 10).

Figure 10. Anomaly map generated by MCAE; A, B, and C show the location relationships between the known Fe deposits and the recognized anomalies.

Figure 10 shows that most of the known Fe deposits were in areas of high anomaly scores, and a few were close to the high-anomaly areas (marked as C in Figure 10). This suggests that these anomalies may be related to iron mineralization, and other areas with higher anomaly scores indicate the possibility of finding Fe deposits in the studied area.

3.3. Performance Evaluation

3.3.1. Receiver Operating Characteristic Curve

The Receiver Operating Characteristic (ROC) curve is often used to measure the predictive ability of a model [65]. A ROC curve that is steep and toward the upper left corner of the plot indicates good performance of a model. Previous studies showed that an Area Under the Curve (AUC) greater than 0.5 and close to 1 often indicates that the prediction model of metallogenic prognosis achieved

an acceptable accuracy [66]. In our experiment, the ROC was used to measure the coincidence degree between the areas with high anomaly scores and the known Fe deposits. The anomaly recognition results from MCAE, Mahalanobis distance (MAHAL) [67], the spectrum–area model (SA) [68], and deep belief nets (DBN) [23] were compared. As shown in Figure 11, the AUC of MCAE was considerably higher than those of the other methods, indicating that MCAE outperformed the other methods in geochemical anomaly recognition.

Figure 11. Receiver Operating Characteristics (ROCs) obtained by different approaches. DBN: deep belief nets; SA: spectrum–area model; MAHAL: Mahalanobis distance; AUC: area under the curve.
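The ROC/AUC comparison can be computed with scikit-learn, as in the sketch below; deposit_labels is a hypothetical 0/1 array marking grid cells that contain known Fe deposits, and is an assumption about how the evaluation data are encoded.

```python
from sklearn.metrics import roc_curve, auc

def evaluate_anomaly_map(scores, deposit_labels):
    """ROC curve and AUC of the anomaly scores against known Fe deposit locations."""
    fpr, tpr, thresholds = roc_curve(deposit_labels, scores)
    return fpr, tpr, auc(fpr, tpr)
```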

3.3.2. Weights-of-Evidence and Student’s t-Value

The weights-of-evidence (WofE) method is a multivariable information synthesis method based on the Bayesian framework that has been used to investigate the significance of spatial correlation between multiclass patterns of a geological feature and a set of mineral deposits [69]. In WofE, each metallogenic variate serves as a layer of evidence that has a positive weight value (W+) and a negative weight value (W−), representing the influences of the presence and absence of the metallogenic evidence layer [70]. The difference between the positive and negative weights was defined as the contrast in WofE and was denoted as C. Student’s t-values were used to measure the significance of the contrast C, and were calculated as:

t = C / S(C) \quad (6)

where S(C) stands for the standard deviation of C. A larger t value corresponds to a more significant C and a stronger controlling effect of the ore-controlling factors on the ore deposits. It was generally believed that there was significant spatial correlation for t > 1.96 at 95% confidence [71]. In Table 3, the first column (t-value) shows whether a result was spatially correlated with the known Fe deposits. The second column (AUC) indicates the correct identification probability of the known Fe deposits. The third column shows the range of anomaly recognition, denoting the proportion of the identified anomaly grids [24]. The fourth column shows the ratio of the number of Fe deposits falling in the forecast area to the total number of known Fe deposits [24]. It is clear that MCAE outperformed the other methods in all aspects.


Table 3. Comparison of various models.

Model    t-Value    AUC      Area of Forecasting    Correct Ratio
MAHAL    3.044      0.800    0.117                  0.632
SA       3.668      0.816    0.211                  0.790
DBN      3.519      0.776    0.309                  0.737
MCAE     4.949      0.894    0.174                  0.895
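A sketch of the contrast and Student's t calculation (Equation (6)) for a single binary evidence layer, using the standard WofE weight formulas; those formulas, and the variable names, are assumptions made for illustration, since the paper does not spell them out.

```python
import numpy as np

def wofe_contrast_t(evidence, deposits):
    """Contrast C = W+ - W- and t = C / S(C) for one binary evidence layer.
    evidence, deposits: 0/1 arrays over grid cells; all four counts are assumed non-zero."""
    b, d = evidence.astype(bool), deposits.astype(bool)
    n11 = np.sum(b & d)      # evidence present, deposit present
    n10 = np.sum(b & ~d)     # evidence present, deposit absent
    n01 = np.sum(~b & d)     # evidence absent, deposit present
    n00 = np.sum(~b & ~d)    # evidence absent, deposit absent
    w_plus = np.log((n11 / (n11 + n01)) / (n10 / (n10 + n00)))
    w_minus = np.log((n01 / (n11 + n01)) / (n00 / (n10 + n00)))
    c = w_plus - w_minus
    s_c = np.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)   # standard deviation of C
    return c, c / s_c                                       # contrast and Student's t
```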

4. Discussion

The key innovation of MCAE is the use of spatial similarity theory to determine the recognition domains of the spatial structures of geochemical backgrounds. If the domain is not clear, the size of the convolution window of MCAE will be set to an incorrect value, leading to inaccurate background reconstruction. To explore this problem in greater depth, we set up MCAEs with different convolution window sizes and observed their anomaly recognition results. According to the Global Moran’s I value curve (Figure 8), we divided the models into three groups: Group 1 (0–48 km), Group 2 (48–64 km), and Group 3 (64–108 km). The MCAE convolution window sizes were set to 5 × 5 grids (Group 1), 10 × 10 grids (Group 1), 30 × 30 grids (Group 2), 40 × 40 grids (Group 3), and 45 × 45 grids (Group 3), and the window sizes of all of the pooling layers were 2 × 2. The samples in the study area were collected on a 2 km × 2 km grid. Therefore, the corresponding domains of these convolution window sizes were 10 km × 10 km (Group 1), 20 km × 20 km (Group 1), 60 km × 60 km (Group 2), 80 km × 80 km (Group 3), and 90 km × 90 km (Group 3). After the three sets of models were trained, we evaluated their anomaly recognition results, as shown in Figure 12.

Figure 12. Average–Variance bars of the forecasting area (AF), correct ratio (CR), and AUC for Groups 1–3.

Figure 12 shows significant differences among the evaluation indicators: Group 2 was superior to Group 1 and Group 3 in all aspects. Figure 13 shows their anomaly recognition results. It was observed that the anomaly areas of the 10 km × 10 km, 20 km × 20 km, 80 km × 80 km, and 90 km × 90 km models were relatively large, and the recognized anomalies were less consistent with the known Fe deposits. The MCAE with a convolution window size of 60 km × 60 km (Group 2) performed better; its recognized anomalies were more relevant to the known Fe deposits and its anomaly areas were smaller. This result indicates that the distance band over which the Global Moran's I curve remains stably above 0 can be used to set the convolution window size of MCAE.


Figure 13. Anomaly maps with different convolution window sizes of 10 km × 10 km, 20 km × 20 km, 60 km × 60 km, 80 km × 80 km, and 90 km × 90 km.


It was necessary to explore whether multiple Encoder–Decoder groups could help improve the ability of MCAE to extract spatial structure features. We analyzed the outputs of MCAEs composed of one, two, and three Encoder–Decoder groups. To eliminate the impact of differences in the weight parameters, the modeling process of the three MCAEs was as follows: first, an MCAE including one Encoder–Decoder group was trained and its anomaly result (R1) was obtained. Then, a new Encoder–Decoder group was added to the trained MCAE to obtain the new anomaly result (R2). In this process, the MCAE retained the previously trained weights and only trained the weights of the newly added group. The MCAE including three Encoder–Decoder groups generated the anomaly result (R3). Generally, multiple encoders enable more precise feature extraction, easily producing a reconstruction that is highly similar to the input. For example, in image processing, an increased number of encoders leads to an increased level of detail in the reconstructed image. This happens because each subsequent encoder breaks up the spatial features extracted by the previous one, forming smaller reconstruction units. However, in geochemical background reconstruction, this would destroy the background spatial structure that had been extracted previously, hindering the accurate reconstruction of the background. In addition, for the window size setting of multiple convolution layers, we used the conventional approach, with each convolution layer half the size of the previous convolution layer, which was beneficial for reducing the computational complexity.

The network structure with one Encoder–Decoder group was set to 25-2-2-25, that with two Encoder–Decoder groups to 25-2-13-2-2-13-2-25, and that with three Encoder–Decoder groups to 25-2-13-2-7-2-2-7-2-13-2-25. The layers with a size of two were the pooling or unpooling layers. For more than three groups, the excessive number of layers led to over-fitting; thus, such structures are not discussed in the comparison with the experimental data.

Figure 14 shows that as the number of Encoder–Decoder groups increased, the reconstruction error gradually decreased, indicating that the reconstructed data became more similar to the original data with a greater number of Encoder–Decoder groups. However, the reduction of the reconstruction error did not represent the accuracy of the background reconstruction: the smaller reconstruction units reproduced unnecessary details, reducing the accuracy of the background reconstruction. Table 4 shows the evaluation of the results. It was observed that the correlation between the identified anomalies and the known Fe deposits decreased with an increasing number of Encoder–Decoder groups.
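To make the layer notation concrete, the sketch below implements one possible reading of the single-group 25-2-2-25 structure, where 25 is taken to be the convolution (and deconvolution) window size and 2 the pooling/unpooling window. This is an illustrative PyTorch sketch rather than the authors' implementation; the number of feature maps, the grid size, and the per-cell anomaly-score formula are assumptions made for the example.

```python
"""Illustrative PyTorch sketch of a single Encoder-Decoder group (25-2-2-25 read as
conv window 25, pool 2, unpool 2, deconv window 25)."""
import torch
import torch.nn as nn

class SingleGroupCAE(nn.Module):
    def __init__(self, n_elements=5, window=25):
        super().__init__()
        pad = window // 2                      # keeps the grid size unchanged for odd windows
        self.conv = nn.Conv2d(n_elements, 8, kernel_size=window, padding=pad)
        self.pool = nn.MaxPool2d(2, return_indices=True)         # 2 x 2 pooling
        self.unpool = nn.MaxUnpool2d(2)                           # 2 x 2 unpooling
        self.deconv = nn.ConvTranspose2d(8, n_elements, kernel_size=window, padding=pad)

    def forward(self, x):
        h = torch.relu(self.conv(x))           # extract background spatial structure
        h, idx = self.pool(h)
        h = self.unpool(h, idx)
        return self.deconv(h)                  # reconstructed geochemical background

# Whitened maps of five elements (Cu, Mn, Pb, Zn, Fe2O3) on a 60 x 60 cell grid (stand-in data).
x = torch.randn(1, 5, 60, 60)
model = SingleGroupCAE()
recon = model(x)
anomaly_score = (x - recon).abs().sum(dim=1)   # per-cell difference between input and reconstruction
print(anomaly_score.shape)                      # torch.Size([1, 60, 60])
```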

Figure 14. Total Error of R1, R2, R3.

Table 4. Comparison of various structures.

Model                          t-Value    AUC      Area of Forecasting    Correct Ratio
25-2-2-25                      4.949      0.894    0.174                  0.895
25-2-13-2-2-13-2-25            3.670      0.841    0.217                  0.794
25-2-13-2-7-2-2-7-2-13-2-25    3.666      0.831    0.243                  0.768

It was also necessary to explore the influence of element decorrelation on MCAE. We mapped the multivariate geochemical anomalies with MCAE for the whitened elements and for the non-whitened elements. To avoid the interference of different structures, the network structures of MCAE were randomly selected as 10-2-5-2-2-5-2-10, 25-2-2-25, and 40-2-20-2-10-2-2-10-2-20-2-40. We evaluated the results of these structures with whitened and non-whitened data (Figure 15). From the AUC and t-value indicators, it is clear that whitening had a significant impact on the results: the results obtained with whitened data fit the known Fe deposits more closely. This suggests that whitening prevented the useful background information from being diluted by redundancy among the correlated elements.
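For completeness, a minimal decorrelation sketch is given below. It uses ZCA whitening as one common choice of whitening transform; the specific transform, data, and parameters used in this study may differ, and the element data here are random stand-ins.

```python
"""Illustrative sketch of element decorrelation before MCAE using ZCA whitening."""
import numpy as np

def zca_whiten(X, eps=1e-8):
    """X: (n_samples, n_elements). Returns data whose sample covariance is ~identity."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return Xc @ W

rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))                                            # shared "geological factors"
X = latent @ rng.normal(size=(2, 5)) + rng.normal(scale=0.1, size=(1000, 5))   # five correlated elements
Xw = zca_whiten(X)
print(np.round(np.cov(Xw, rowvar=False), 2))   # approximately the identity matrix
```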


Figure 15. Evaluation of the results with whitened data and non-whitened data.

5. Conclusions

The spatial structure of regional geochemical elements represents the synthesis of the regular changes of various main geological processes [72]. We used the spatial structure features as the reconstruction component to reconstruct the geochemical background accurately, providing a new concept and method for the identification of geochemical anomalies. The determination of the recognition domains of the background spatial structure of geochemical elements was the focus of this paper. Generally, the size of the convolution window is set to a default value, usually 3 × 3 or 5 × 5 [56,73]. For instance, in image processing, a 3 × 3 or 5 × 5 convolution window can extract the boundaries of a graph well and achieve good image reconstruction [19,28]. However, in geochemical exploration, this approach led to unsatisfactory performance of CAE because the convolution window sizes were not set according to the recognition domain of the background (see Section 4). In this work, the use of the Moran's I index to determine the recognition domains of the background spatial structure helped MCAE to accurately extract the background spatial structure features and reconstruct the backgrounds. This method was applied to the analysis of the data from the southwestern Fujian Province (China). The anomaly result exhibited a strong spatial correlation with the known Fe deposits.

The method used in this study still has some limitations: (1) the method is limited to the application of regular grid sampling of geochemical samples; and (2) it lacks consideration of the spatial structure extraction of multi-source spatial data (such as remote sensing data and geophysical data). Therefore, future research will focus on improving MCAE in the following respects: (1) the limitation of regular grid sampling for the input data must be eliminated, and a method for mapping multivariate anomalies from irregularly sampled data should be developed; and (2) for multiple sources of spatial data, a suitable spatial structure extraction and fusion method should be developed.

Author Contributions: Q.G. provided the idea of processing data. L.C. and Q.G. wrote this paper. F.Z., H.Y., B.F. and J.W. retrieved and processed geochemical data.

Funding: This research was supported by the National Key Research and Development Program of China (No. 2016YFC0600508), the National Natural Science Foundation of China (No. 41671408 and 41301426), and the Natural Science Foundation of Hubei Province (Grant No. 2017CFA041).

Acknowledgments: We sincerely thank Renguang Zuo at China University of Geosciences for kindly providing the data for our experiments. We sincerely thank the editor and three anonymous reviewers for constructive comments and suggestions.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Hawkes, H.E.; Webb, J.S. Geochemistry in Mineral Exploration. Soil Sci. 1963, 95, 283. [CrossRef]
2. Zuo, R. Machine Learning of Mineralization-Related Geochemical Anomalies: A Review of Potential Methods. Nonrenew. Resour. 2017, 26, 457–464. [CrossRef]
3. Zaremotlagh, S.; Hezarkhani, A. The use of decision tree induction and artificial neural networks for recognizing the geochemical distribution patterns of LREE in the Choghart deposit, Central Iran. J. Afr. Earth Sci. 2017, 128, 37–46. [CrossRef]
4. Wang, X.Q. Exploration geochemistry: Past achievements and future challenges. Earth Sci. Front. 2003, 10, 239–248.
5. Liu, Y.; Zhou, K.; Cheng, Q. A new method for geochemical anomaly separation based on the distribution patterns of singularity indices. Comput. Geosci. 2017, 105, 139–147. [CrossRef]
6. Zuo, R.; Xiong, Y. Big data analytics of identifying geochemical anomalies supported by machine learning methods. Nat. Resour. Res. 2018, 27, 5–13. [CrossRef]
7. Danielsson, A.; Cato, I.; Carman, R.; Rahm, L. Spatial clustering of metals in sediments of Skagerrak/Kattegat. Appl. Geochem. 2009, 14, 489–706. [CrossRef]
8. Zuo, R.; Carranza, E.J.M.; Wang, J. Spatial analysis and visualization of exploration geochemical data. Earth-Sci. Rev. 2016, 158, 9–18. [CrossRef]
9. Cheng, Q. Multifractality and spatial statistics. Comput. Geosci. 1999, 25, 949–961. [CrossRef]
10. Cheng, Q.; Bonham-Carter, G.F.; Hall, G.E.M.; Bajc, A. Statistical study of trace elements in the soluble organic and amorphous Fe-Mn phases of surficial sediments, Sudbury Basin. 1. Multivariate and spatial analysis. J. Geochem. Explor. 1997, 59, 27–46. [CrossRef]
11. Cheng, Q.; Agterberg, F.; Ballantyne, S. The separation of geochemical anomalies from background by fractal methods. J. Geochem. Explor. 1994, 51, 109–130. [CrossRef]
12. Cheng, Q.; Agterberg, F.; Bonham-Carter, G. A spatial analysis method for geochemical anomaly separation. J. Geochem. Explor. 1996, 56, 183–195. [CrossRef]
13. Grunsky, E.C.; Agterberg, F.P. Spatial and multivariate analysis of geochemical data from metavolcanic rocks in the Ben Nevis area, Ontario. Math. Geosci. 1988, 20, 825–861. [CrossRef]
14. Haas, T.C. Multivariate Geostatistics: An Introduction With Applications. Int. J. Rock Mech. Min. Sci. Geomech. Abstr. 2003, 33, 363A. [CrossRef]
15. Krige, D.G. Lognormal-de Wijsian Geostatistics for Ore Evaluation; South African Institute of Mining and Metallurgy: Johannesburg, South Africa, 1981.
16. Swart, P.; Wortmann, U.; Mitterer, R.; Malone, M.; Smart, P.; Feary, D.; Hine, A. Hydrogen sulfide–hydrates and saline fluids in the continental margin of South Australia. Geology 2000, 28, 1039–1042. [CrossRef]
17. Hinton, G.E.; Osindero, S.; Teh, Y.-W. A Fast Learning Algorithm for Deep Belief Nets. Neural Comput. 2006, 18, 1527–1554. [CrossRef]
18. Deng, L. Automatic Speech Recognition: A Deep Learning Approach; Springer: Berlin, Germany, 2014.
19. Liu, C.; Cao, Y.; Luo, Y.; Chen, G.; Vokkarane, V.; Ma, Y. DeepFood: Deep Learning-Based Food Image Recognition for Computer-Aided Dietary Assessment. In Advances in Nonlinear Speech Processing; Springer Nature: Basingstoke, UK, 2016; pp. 37–48.
20. Lam, A.N.; Nguyen, A.T.; Nguyen, H.A.; Nguyen, T.N. Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports (N). In Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, Lincoln, NE, USA, 9–13 November 2015; pp. 476–481.
21. Beucher, A.; Österholm, P.; Martinkauppi, A.; Eden, P.; Fröjdö, S. Artificial neural network for acid sulfate soil mapping: Application to the Sirppujoki River catchment area, south-western Finland. J. Geochem. Explor. 2013, 125, 46–55. [CrossRef]
22. Zhao, J.; Chen, S.; Zuo, R. Identifying geochemical anomalies associated with Au–Cu mineralization using multifractal and artificial neural network models in the Ningqiang district, Shaanxi, China. J. Geochem. Explor. 2016, 164, 54–64. [CrossRef]
23. Xiong, Y.; Zuo, R. Recognition of geochemical anomalies using a deep autoencoder network. Comput. Geosci. 2016, 86, 75–82. [CrossRef]
24. Chen, Y.; Lu, L.; Li, X. Application of continuous restricted Boltzmann machine to identify multivariate geochemical anomaly. J. Geochem. Explor. 2014, 140, 56–63. [CrossRef]
25. Sahiner, B.; Chan, H.P.; Petrick, N.; Wei, D.; Helvie, M.A.; Adler, D.D.; Goodsitt, M.M. Classification of mass and normal breast tissue: A convolution neural network classifier with spatial domain and texture images. IEEE Trans. Med. Imaging 1996, 15, 598–610. [CrossRef] [PubMed]
26. Hinton, G.; Mohamed, A.-R.; Jaitly, N.; Vanhoucke, V.; Kingsbury, B.; Deng, L.; Yu, D.; Dahl, G.; Senior, A.; Nguyen, P.; et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups. IEEE Signal Process. Mag. 2012, 29, 82–97. [CrossRef]
27. Chen, Y.-H.; Emer, J.; Sze, V. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks. In Proceedings of the 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, South Korea, 18–22 June 2016; Volume 44, pp. 367–379. [CrossRef]
28. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the International Conference on Neural Information Processing Systems, Stateline, NV, USA, 3–8 December 2012; pp. 1097–1105.
29. Xie, J.; Chen, W. The research of face recognition based on convolutional neural network. Softw. Guide 2018, 312–315.
30. Zeiler, M.D.; Fergus, R. Visualizing and Understanding Convolutional Networks; Springer: Berlin, Germany, 2014; pp. 818–833.
31. Nam, H.; Sugiyama, M. Direct Density Ratio Estimation with Convolutional Neural Networks with Application in Outlier Detection. IEICE Trans. Inf. Syst. 2015, 98, 1073–1079. [CrossRef]
32. Wei, L.; Huang, Q.; Ceylan, D.; Vouga, E.; Li, H. Dense Human Body Correspondences Using Convolutional Networks. 2016. Available online: https://openaccess.thecvf.com/content_cvpr_2016/papers/Wei_Dense_Human_Body_CVPR_2016_paper.pdf (accessed on 28 April 2019).
33. Li, Z.; Fan, Y.; Wang, F. Convolutional autoencoder-based color image classification using chroma subsampling in YCbCr space. In Proceedings of the International Congress on Image & Signal Processing, Shenyang, China, 14–16 October 2015; pp. 351–355.
34. Geng, J.; Fan, J.; Wang, H.; Ma, X.; Li, B.; Chen, F. High-Resolution SAR Image Classification via Deep Convolutional Autoencoders. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1–5. [CrossRef]
35. Badrinarayanan, V.; Handa, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling. arXiv 2015. Available online: https://arxiv.org/pdf/1511.00561.pdf (accessed on 28 April 2018).
36. Chen, K.; Seuret, M.; Liwicki, M.; Hennebert, J.; Ingold, R. Page segmentation of historical document images with convolutional autoencoders. In Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR), Tunis, Tunisia, 26–30 September 2015; pp. 1011–1015.
37. Baccouche, M.; Mamalet, F.; Wolf, C.; Garcia, C.; Baskurt, A. Spatio-Temporal Convolutional Sparse Auto-Encoder for Sequence Classification. Available online: https://pdfs.semanticscholar.org/7b4e/2c4d196735119fe788c01b8010d44e3328e4.pdf (accessed on 28 April 2018).
38. Chen, M.; Shi, X.; Zhang, Y.; Wu, D.; Guizani, M. Deep Features Learning for Medical Image Analysis with Convolutional Autoencoder Neural Network. IEEE Trans. Big Data 2017, 1. Available online: https://www.researchgate.net/publication/317702765_Deep_Features_Learning_for_Medical_Image_Analysis_with_Convolutional_Autoencoder_Neural_Network (accessed on 28 April 2018).
39. Idris, A.M. Combining multivariate analysis and geochemical approaches for assessing heavy metal level in sediments from Sudanese harbors along the Red Sea coast. Microchem. J. 2008, 90, 159–163. [CrossRef]
40. Reimann, C.; Garrett, R.G. Geochemical background—Concept and reality. Sci. Total Environ. 2005, 350, 12–27. [CrossRef]
41. Mclennan, S.M. Geochemical Classification of the Elements; Springer: Berlin, Germany, 1998; pp. 282–292.
42. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the Computer Vision & Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2818–2826.
43. Thompson, E.S.; Saveyn, P.; Declercq, M.; Meert, J.; Guida, V.; Eads, C.D.; Robles, E.S.; Britton, M.M. Characterisation of heterogeneity and spatial autocorrelation in phase separating mixtures using Moran’s I. J. Colloid Interf. Sci. 2018, 513, 180–187. [CrossRef]
44. Kessy, A.; Lewin, A.; Strimmer, K. Optimal Whitening and Decorrelation. Available online: https://arxiv.org/pdf/1512.00809.pdf (accessed on 28 April 2019).
45. Mei, Y.; Androutsos, D. PCA-whitening CSS shape descriptor for affine invariant image retrieval. In Proceedings of the Canadian Conference on Electrical and Computer Engineering, St. John’s, NL, Canada, 3–6 May 2009; pp. 235–238.
46. Hossain, M. Whitening and Coloring Transforms for Multivariate Gaussian Random Variables. 2016. Available online: https://www.projectrhea.org/rhea/images/1/15/Slecture_ECE662_Whitening_and_Coloring_Transforms_S14_MH.pdf (accessed on 28 April 2019).
47. Koivunen, A.C.; Kostinski, A.B. The Feasibility of Data Whitening to Improve Performance of Weather Radar. J. Appl. Meteorol. 2010, 38, 741–749. [CrossRef]
48. Matheron, G. Principles of geostatistics. Econ. Geol. 1963, 58, 1246–1266. [CrossRef]
49. Cheng, Q. Spatial and scaling modeling for geochemical anomaly separation. J. Geochem. Explor. 1999, 65, 175–194. [CrossRef]
50. Bellehumeur, C.; Marcotte, D.; Jébrak, M. Multi-element relationships and spatial structures of regional geochemical data from stream sediments, southwestern Quebec, Canada. J. Geochem. Explor. 1994, 51, 11–35. [CrossRef]
51. Bin, J.I.; Zhou, T.; Yuan, F.; Zhang, D.; Liu, L.; Liu, G. A method for identifying geochemical anomalies based on spatial autocorrelation. Sci. Surv. Mapp. 2017, 42, 24–27.
52. Liu, Q.; Xie, W.-J.; Xia, J.-B. Using Semivariogram and Moran’s I Techniques to Evaluate Spatial Distribution of Soil Micronutrients. Commun. Soil Sci. Plan. 2013, 44, 1182–1192. [CrossRef]
53. McKinley, J.M.; Ofterdinger, U.; Young, M.; Barsby, A.; Gavin, A. Investigating local relationships between trace elements in soils and cancer data. Spat. Stat. 2013, 5, 25–41. [CrossRef]
54. Moran, P.A.P. Notes on Continuous Stochastic Phenomena. Biometrika 1950, 37, 17. [CrossRef] [PubMed]
55. Anselin, L. Spatial Processes, Models and Applications. Econ. Geogr. 1983, 59, 4. [CrossRef]
56. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436. [CrossRef]
57. Zeiler, M.D.; Krishnan, D.; Taylor, G.W.; Fergus, R. Deconvolutional Networks. 2010. Available online: https://www.matthewzeiler.com/mattzeiler/deconvolutionalnetworks.pdf (accessed on 28 April 2019).
58. Xie, X.; Mu, X.; Ren, T. Geochemical mapping in China. J. Geochem. Explor. 1997, 60, 99–113. [CrossRef]
59. Zuo, R.G.; Xia, Q.L.; Zhang, D.J. Geological Process-Based Mineral Resource Quantitative Prediction and Assessment for Makeng Type Iron Polymetallic Deposits in Fujian. Earth Sci. 2012, 37, 1183–1190.
60. China Geological Survey. Regional Geological Map of Fujian Province (1:200,000); Geological Press: Fuzhou, China, 2010.
61. Wang, H.; Cheng, Q.; Zuo, R. Spatial characteristics of geochemical patterns related to Fe mineralization in the southwestern Fujian province (China). J. Geochem. Explor. 2015, 148, 259–269. [CrossRef]
62. Zuo, R. Selection of an elemental association related to mineralization using spatial analysis. J. Geochem. Explor. 2018, 184, 150–157. [CrossRef]
63. Govett, G.J.S.; Goodfellow, W.D.; Chapman, R.P.; Chork, C.Y. Exploration geochemistry—Distribution of elements and recognition of anomalies. Math. Geosci. 1975, 7, 415–446. [CrossRef]
64. Yan, C. A study of the spatial structural features of the regional geochemical data. Comput. Tech. Geophys. Geochem. Explor. 1991, 4, 289–294.
65. Zou, K.H.; O’Malley, A.J.; Mauri, L. Receiver-Operating Characteristic Analysis for Evaluating Diagnostic Tests and Predictive Models. Circulation 2007, 115, 654–657. [CrossRef] [PubMed]
66. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2005, 27, 861–874. [CrossRef]
67. Chen, Y.; Lu, L.; Li, X. Kernel Mahalanobis distance for multivariate geochemical anomaly recognition. J. Jilin Univ. Earth Sci. Ed. 2014, 44, 396–408.
68. Zuo, R.; Xia, Q.; Zhang, D. A comparison study of the C–A and S–A models with singularity analysis to identify geochemical anomalies in covered areas. Appl. Geochem. 2013, 33, 165–172. [CrossRef]
69. Agterberg, F.P.; Bonham-Carter, G.F.; Cheng, Q.; Wright, D.F. Weights of Evidence Modeling and Weighted Logistic Regression for Mineral Potential Mapping. Available online: http://www.ige.unicamp.br/sdm/doc/documentation/wlr.pdf (accessed on 28 April 2019).
70. Bonham-Carter, G. Geographic information systems for geoscientists: Modelling with GIS. Comput. Methods Geosci. Pergamon 1994, 4, 1–2.
71. Zhang, D.; Cheng, Q. A Comparison of Two Different Unit Division Methods in Weights of Evidence. J. Jilin Univ. 2013, 43, 1040–1051.
72. Wang, J.; Zuo, R.; Caers, J. Discovering geochemical patterns by factor-based cluster analysis. J. Geochem. Explor. 2017, 181, 106–115. [CrossRef]
73. Baydin, A.G.; Pearlmutter, B.A.; Siskind, J.M. Tricks from Deep Learning. 2016. Available online: https://arxiv.org/pdf/1611.03777.pdf (accessed on 28 April 2019).

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).