Ecological Indicators 45 (2014) 184–194

Contents lists available at ScienceDirect

Ecological Indicators

jo urnal homepage: www.elsevier.com/locate/ecolind

Spatial prediction of organic matter content integrating artificial

neural network and ordinary in Tibetan Plateau

a,b a a b b,∗

Fuqiang Dai , Qigang Zhou , Zhiqiang Lv , Xuemei Wang , Gangcai Liu

a

College of Tourism and Land Resources, Chongqing Technology and Business University, Chongqing 400067, China

b

Key Laboratory of Mountain Hazards and Earth Surface Processes, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences and

Ministry of Water Resources, Chengdu 610041, China

a r t i c l e i n f o a b s t r a c t

Article history: (SOM) content is considered as an important indicator of . An accurate

Received 20 January 2014

spatial prediction of SOM content is so important for estimating soil organic carbon pool and monitoring

Received in revised form 21 March 2014

change in it over time at a regional scale. Due to the unfavourable natural conditions in Tibetan Plateau,

Accepted 1 April 2014

soil sampling with high density is time consuming and expensive. As a result, little research has focused

on the spatial prediction of SOM content in Tibet because of shortage of data. We used a two-stage

Keywords:

process that integrated an artificial neural network (ANN) and the estimation of its residuals by ordinary

Soil organic matter

kriging to produce accurate SOM content maps based on sparsely distributed observations and available

Digital soil mapping

auxiliary information. SOM content data were obtained from a in Tibet and were used to train

Artificial neural network

and validate the ANN-kriging methodology. Available environmental information including elevation,

Ordinary kriging

Accuracy improvement temperature, precipitation, and normalized difference vegetation index were used as auxiliary variables

in the ANN training. The prediction accuracy of SOM content was compared with those of ANN, universal

kriging, and inverse distance weighting (IDW). A more accurate prediction of SOM content was obtained

−1

by ANN-kriging, with lower global prediction errors (root mean square error = 6.02 g kg ) and higher Lin’s

concordance correlation coefficient (0.75) for validation sampling sites compared with other methods.

Relative improvements of 26.94–37.10% over other methods were observed in the prediction of SOM

content. In conclusion, the proposed ANN-kriging methodology is particularly capable of improving the

accuracy of SOM content mapping at large scale.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction accurate SOM content map is favoured to estimate the baseline of

the organic carbon pool in soil and monitor the change over time at

Soil organic matter (SOM) content is considered as an important various scales so as to understand responses of the terrestrial sys-

indicator of soil quality which greatly contributes to the ecosystem tem to climate change (Milne et al., 2007) and to verify

services such as food production and carbon sequestration. SOM sequestration (Smith, 2004).

has an important effect on many soil properties and major bio- (DSM) in , also referred to as

geochemical cycles and is a strong indicator of fertility and land predictive soil mapping or , is the creation and

degradation (Manlay et al., 2007). With increasing SOM content, the population of a geographically referenced soil database, gener-

the physical and hydraulic properties of soil are improved, and its ated at a given resolution by using field and laboratory observation

nutrient-holding capacity is enhanced (Benjamin et al., 2008). A methods coupled with environmental data through quantitative

better understanding of SOM content and its spatial variation is relationships (McBratney et al., 2003; Scull et al., 2003). DSM has

essential for using efficiently and maintaining soil produc- paid more attention on mapping SOM content as well as estimating

tivity. Furthermore, soil, being an important sink of atmospheric carbon stocks (Karunaratne et al., 2014). But prediction accuracy

carbon, contains about three times more organic carbon (an essen- of SOM content for the typical physiogeographical area such as

tial part of SOM) than vegetation and about twice as much carbon Tibetan Plateau is too low to help with estimating the soil organic

as the atmosphere (Batjes and Sombroek, 1997). Therefore, an carbon pool and monitoring changes in it over time. Fortunately,

a global project has been formed to make a new digital

of the world using emerging technologies, in view of a need for

∗ accurate and spatially referenced soil information (Sanchez et al.,

Corresponding author. Tel.: +86 28 85231287; fax: +86 28 85238973.

E-mail address: [email protected] (G. Liu). 2009). The need is just consistent with developing a methodology

http://dx.doi.org/10.1016/j.ecolind.2014.04.003

1470-160X/© 2014 Elsevier Ltd. All rights reserved.

F. Dai et al. / Ecological Indicators 45 (2014) 184–194 185

in the study that allows for improvements in accurately predicting ANN can incorporate the spatial autocorrelation of measured values

SOM content at large scale. During the last decades, various DSM which can lead to better predictions and lower error.

methods have been used for mapping SOM content based on soil The Tibetan Plateau has been considered the best laboratory for

observations. Generally, there are two main types of interpolation environmental research because of the prominent responses of the

methods: deterministic and stochastic. The former uses determi- plateau’s biogeochemical cycle to global changes (Bi, 1997). Accord-

nistic techniques, which calculate the unknown values based on ing to the pattern and change of soil organic carbon storage in China,

parameters that control either (i) the extent of the similarity of the SOM content has decreased in southern Tibet due to anthro-

values (e.g., IDW, inverse distance weighting) or (ii) the degree of pogenic activities and natural factors (Wang et al., 2003; Dai et al.,

smoothing of the surface (e.g., RBF, radial basis function). The latter 2011). To scale down this result to a regional level, it is necessary to

are stochastic techniques, which assume that at least some of the develop a method for improving the accuracy of SOM predictions.

spatial variation of natural phenomena can be modelled by random To our knowledge, however, little research has been conducted on

processes with spatial autocorrelation. Kriging is one of the most the spatial prediction of SOM content in the south of Tibet (Li et al.,

widely used methods among the stochastic techniques and is the 2009). In this study, a two-stage process methodology integrating

best linear unbiased estimator in the sense that it minimizes the ANN and kriging was proposed for predicting SOM content. The

variance of the estimation error (Webster and Oliver, 2001). Krig- primary objective is to investigate the improvement in accuracy of

ing methods have shown considerable advantages in the prediction SOM content mapping achieved by integrating ANN and residual

of soil properties including SOM content, compared with deter- estimation by kriging over ANN, universal kriging, and IDW.

ministic interpolation methods (Liu et al., 2008; Schloeder et al.,

2001; Worsham et al., 2010). Although sampling density could be

reduced to a certain degree relative to the deterministic techniques, 2. Materials and methods

the accuracy of the SOM content maps with kriging interpolation

is strongly dependent on the number of soil samples. Further, a 2.1. Study area

large variability in elevation, together with difference in climate

and vegetation, can result in strong local variation of SOM content. This study was conducted in one of seven physiogeographical

In such cases, the reliability of kriging estimates deceases, as site- units in Tibet classified by the Chinese Academy of Sciences (CAS,

specific measurements can be influenced by strong local variation 1982) and is located in the south of Tibetan Plateau with a total area

2

(Alsamamra et al., 2009). Alternatively, cokriging has been used to more than 220,000 km (Fig. 1). Broad valleys and basins are the

overcome this limitation by incorporating auxiliary variables that main types of topography in the area. It is characterized by plateau

may compensate for the lack of data and the small sample size cold monsoon and semi-arid climate, with a mean annual temper-

(Odeh et al., 1995). This method is valuable when the auxiliary ature of 1–7.5 C, a mean annual relative humidity of 40–50%, and a

variable is highly correlated with the SOM content and is sam- mean annual precipitation of 200–500 mm represent the ranges of

pled intensely (Wu et al., 2009). Previous studies have shown that means for different portions of the study area. According to Chinese

environmental variables, such as terrain attributes (Mueller and soil classification, the main soil types are mountain shrub steppe

Pierce, 2003), remotely sensed data (Wu et al., 2009), and climate soils and alpine steppe soils ( in the USDA Soil Taxonomy

data (Mishra et al., 2010), have had the potential as useful aux- and in the FAO World Reference Base for Soil Resources).

iliary variables for improving the accuracy and reliability of SOM This area is an important agricultural zone in Tibet where the dom-

prediction. Instead of directly including the auxiliary variables in inant is irrigated cropping systems producing grains and

the kriging process, regression kriging was proposed which con- crops.

siders the auxiliary environmental data as independent variables

in multiple linear regression and then incorporates the predic-

tion of soil attributes and the kriging of residuals (Frogbrook and 2.2. Soil observations and environmental variables

Oliver, 2001; Piccini et al., 2014; Takata et al., 2007; Zhao et al.,

2014). The accuracy of these methods is still dependent on the A total of 244 soil samples (0–25 cm) were collected over the

density and size of sampling sites, as these methods are based study area in 2007 (Fig. 1), taking , lithologic characters,

on interpolation, which requires some data as inputs. Therefore, vegetation, and topography into consideration. A large sampling

a more efficient method is required to improve the accuracy of grid size was designed because of the extent of the study area and

interpolation methods for producing high-resolution SOM content its unfavourable natural conditions. Sampling sites were selected

2

maps. on a 64 × 64 km grid for forest, pasture, and waste land and on an

2

×

Essentially, the non-linearity and multicollinearity problems 8 8 km grid for arable land. The longitude and latitude of each

existed in the interpolation process of both cokriging and multiple sampling site were recorded with a global positioning system (GPS).

linear regression (Alvarez et al., 2011; Gautam et al., 2011). To over- A portion of 3 kg of soil at each sampling site, taken from composite

come these problems along with the limited soil observations and samples, was air dried and sieved to pass a 2-mm mesh to deter-

available auxiliary information, linear mixed model (Karunaratne mine SOM content with wet oxidation with external heat applied

et al., 2014), support vector regression (SVR) (Ballabio, 2009), and (Liu, 1996; Mebius, 1960). In the procedure, SOM was oxidized by

artificial neural network (ANN) (Malone et al., 2009; Zhao et al., potassium dichromate and heated to about 170–180 C for five min-

2010) were introduced to construct the relationships between SOM utes, and then excess potassium dichromate was determined by

−1

content and other environmental variables and thus to produce titration with standard 0.2 mol L ferrous sulfate (FeSO4). The SOM

more accurate SOM content maps. Further, fewer studies used content was estimated by multiplying the amount of organic car-

other machine learning approaches like Cubist (Lacoste et al., 2014), bon by the conventional carbon factor (1.724) (Read and Ridgell,

boosted regression tree (Martin et al., 2011), and Random Forest 1922).

(RF) (Subburayalu and Slater, 2013) to predict SOM in order to In the study area, a (DEM) with a

improve prediction accuracy. However, a major disadvantage of spatial resolution of 90 m was obtained from the Environmen-

these methods is that the SOM content at a particular grid node is tal and Ecological Science Data Center for West China, National

derived only from auxiliary information at each point, regardless Natural Science Foundation of China. Low-elevation areas were

of the spatial autocorrelation of the surrounding measured data located in the valleys of the middle reaches of the Yarlung Zangbo

(Takata et al., 2007). Integrating ordinary kriging of residuals from River, whereas high-elevation areas were located in the south of

186 F. Dai et al. / Ecological Indicators 45 (2014) 184–194

Fig. 1. Location of study area in China, the elevation in meters above sea level, and soil sampling sites including (a) training sites (n = 163) and (b) validation sites (n = 81).

the Gangdise–Nyainqentanglha Mountains and the north of the (Fig. 2). A RBF network, as a multi-layer feed-forward ANN, was

Himalayas. applied to estimate the SOM content at the training sites. Com-

Climate data with 500 m spatial resolution, including mean pared to back-propagation networks, the architecture and training

annual temperature and mean annual precipitation, were obtained algorithms of RBF networks are simpler and clearer (Lin and Chen,

from the Data Sharing Network of Earth System Science, China. The 2004; Gautam et al., 2011). Broomhead and Lowe (1988) were the

climate variables were mapped by the IDW method based on the first to exploit the use of RBFs in the design of neural networks. RBF

observed data from the climate stations over the area. In general, networks typically have three layers, that is, an input layer, a hidden

the mean annual temperature decreased with increases in eleva- layer with a non-linear RBF activation function and a linear output

tion, while a higher mean annual temperature was observed in layer, with a number of neurons in each. Such a network possesses

the valley area of the middle reaches of the Yarlung Zangbo River. self-organizing characteristics that allow for adaptive determina-

The mean annual precipitation decreased from the southeast to the tion of the hidden neurons during training of the network (Zhang

northwest. The valley area in the east of the Yarlung Zangbo River and Kushwaha, 1999). Each input neuron is completely connected

had more precipitation than the other areas, especially west of the to all hidden neurons, and hidden neurons and output neurons

source zone of Yarlung Zangbo River. are also interconnected to each other by a set of weights. In this

The normalized difference vegetation index (NDVI) at 1000 m study, information fed into the network through four input neu-

spatial resolution was derived from MODIS image in August rons, including elevation, temperature, precipitation, and NDVI,

Y T

2007 in the International Scientific Data Service Platform, Com- = [y1, y2, y3, y4] , is transmitted to the hidden neurons; each

puter Network Information Center, Chinese Academy of Sciences hidden neuron then transforms the input neuron using a trans-

(http://datamirror.csdb.cn). The NDVI values were produced by fer function . The number of hidden neurons is selected based on

using the 16-day MODIS vegetation index and the composite sur- testing and validation of the RBF network’s performance. The out-

face reflectance products as input. In the study area, the NDVI values put of a hidden neuron consisting of one neuron corresponding to

increased from west to east. the SOM content has the form of a RBF. The output of each hidden

A GIS database within ArcGIS was created for the study area neuron, ht, is given by

which contains spatial and attributive data to predict SOM content,

h y =   ,

y − c

namely the soil sample maps and environmental variable maps. t ( s) ( s t ) (1)

All digitized layers from different sources were organized with an

where ys is the sth input neuron.

identical projection system and the environmental variable maps

For the present model, a Gaussian function was selected for the

were resampled within ArcGIS at 2000 m spatial resolution, which

RBF network. The Gaussian is a positive radial symmetric function

is consistent with that of the SOM content prediction map. A near-

(kernel) with centre ct and width . The width is the radial distance

est neighbor assignment method was used for resampling climate

from the center of kernel within the value of the function differs

data and NDVI, while a bilinear interpolation was performed for

significantly from zero (Sun et al., 2009). An input pattern falling

DEM which determines the new value of a cell based on a weighted

within the receptive field will cause a significant response. For each

distance average of the four nearest input cell centers.

input pattern, the hidden neurons compute the distance between

the input neuron and the center of the receptive field. For a Gaussian

2.3. Spatial prediction of SOM content by radial basis function

function, the response is unity if this distance is zero, and decays

network

to zero when the distance is greater than the width. The response

of the hidden neuron to an input neuron Y is given by

The 244 sampling sites were divided into two groups: two-

   

thirds of the sampling sites (n = 163) were used to train the ANN  2

Y − ct

model and one-third (n = 81) were used for validation. This parti- Y =

( ) exp − , (2)

22

tion was conducted randomly by using the “create subset” function

of Geostatistical Analyst in ArcGIS 9.2.

 

To improve the accuracy of SOM content mapping, ANN was where ( ) is the Gaussian activation function, denotes the

used first to estimate the SOM content with auxiliary variables Euclidean norm, ct is the centre of the tth neuron in the hidden

F. Dai et al. / Ecological Indicators 45 (2014) 184–194 187

Fig. 2. Flow diagram of soil organic matter (SOM) content mapping integrating artificial neural network (ANN) and residual estimation by ordinary kriging. NDVI is the

normalized difference vegetation index.

Z x

layer, and is the width of the hidden neuron. can be computed between ( i) and the extrinsic factors was removed as a result

by (Haykin, 1994) of ANN prediction.

r x , r x , . . ., r x x , x , . . ., x d Given the residuals ( 1) ( 2) ( N ) at sites 1 2 N ,

max ,

= √

(3) the value of a random variable r is estimated by kriging. Ordinary

2T

kriging, the most common type of kriging in practice is a least-

d

where max is the maximum distance between centres of the hidden squares method of spatial prediction based on the assumption of

neurons and T is the number of hidden neurons. an unknown mean. The ordinary estimate is given by Isaaks and

The responses are multiplied by the interconnected weights w Shrivastava (1989):

between the hidden and output layers. Each unit in the output layer n

then makes a linear transformation on the data from the hidden

r x = r x ,

ˆOK( 0) i ( i) (6)

layer. For the output layer, the output can be obtained from i=1

m

r x where ˆOK ( 0) is the estimated r at a site x0 with ordinary kriging, Zˆ = w (Y), (4)

ANN t t

= i are the optimal weights under the condition that i 1, and

t=1

r x x

( i) is the ANN residual value at site i.

Z

where ˆANN represents the SOM content estimated by the ANN, wt Spatial patterns are usually described using the experimen-

h

is the connecting weight between the tth hidden neuron and the tal semivariogram ˆ ( ), which measures the average dissimilarity

output neuron, and t(Y) is the response of the tth hidden neuron between data separated by the lag distance h (Goovaerts, 1997).

resulting from all input data. It is calculated as half the average squared difference between the

In a RBF network there are three types of parameters that need components of data pairs,

to be chosen to adapt the network for a particular task: the centre

N(h)

vectors ct, the output weights wt, and the RBF width parameters h 1 r x 2

= − r x + h ,

ˆ ( ) [ ( i) ( i )] (7)

. The training of an RBF network involves a learning algorithm, 2N(h)

i=1

which enables the network to adapt its behaviour to a given rela-

tionship between input and output (Schmitz et al., 2005). Training where N(h) is the number of pairs of sampling sites separated by a

consists of the following steps: (i) the RBF network is simulated, (ii) lag distance h. This model is then used for the kriging interpolation

the input vector with the greatest error is found, (iii) a radial basis of ANN residuals. The GS+ 7.0 geostatistical software was used to

transfer function is added with weights equal to that vector, (iv) the predict the residuals for the study area using ordinary kriging.

Z x

linear transfer function layer weights are redesigned to minimize The final estimated SOM content ˆ( i) was obtained as the sum

Z x

the error. The steps above are repeated until the network’s mean of the ANN estimates ˆANN( i) and ordinary kriging estimates of the

r x

squared error falls below the target. To train the RBF network, start- residuals ˆOK( i) as follows:

ing with zero neurons, the hidden neurons are added one by one

Z x = Z x + r x .

ˆ( i) ˆANN( i) ˆOK( i) (8)

until the best value is found that performed well in both training

and testing. To verify the method proposed in this study, both stochastic

interpolation methods (e.g., universal kriging) and deterministic

2.4. Estimation of residuals by ordinary kriging interpolation methods (e.g., IDW) were conducted, and then the

prediction accuracies were compared. In this study, interpolation

In the residual kriging procedure, the ANN prediction was car- methods were implemented using the GS+ 7.0 geostatistical soft-

ried out first. A residual error can be defined as follows: ware, while the ANN was trained using neural network toolbox in

MATLAB 7.0.

r x = Z x − Z x ,

( i) ( i) ˆANN( i) (5)

r x x Z x

where ( i) is the residual at site i, ( i) is the measured value, and 2.5. Evaluation of the accuracy of interpolation methods

Z x

ˆANN( i) is the value estimated by ANN.

r x

The new variable ( i) retains the spatial variability determined Generally, the cross-validation method is used to assess which

by intrinsic factors, whereas most of the complex relationship interpolation method gives the best performance. In this study,

188 F. Dai et al. / Ecological Indicators 45 (2014) 184–194

the accuracy of interpolation for SOM content mapping was eval-

uated using an independent data set, which is more stringent than 0.714 0.097 0.656 0.320 0.294 0.151 0.564

NDVI 46.99

the common cross-validation procedure (Alsamamra et al., 2009). −

From the predicted SOM content maps, the values were extracted

for the validation sites. Three validation indices defined by Isaaks

and Shrivastava (1989) were used to evaluate the performance of (mm)

different interpolation methods, that is, the mean error (ME), mean

absolute error (MAE) and root mean square error (RMSE), computed

as follows:

n 0.935 0.022 4.1    6.4 3.4 4.4 2.6 Precipitation 77.06 − −

1 −

= ,

ME Zˆ x − Z x (9) n ( i) ( i) i=1

C)

n ◦

 (

1

MAE = Zˆ x ) − Z(x ) , (10) n ( i i

i=1

2.178 10.939 n 2 77.5 21.51 360.5  Temperature 223.3 787.9 348.3

1  

= Zˆ x − Z x , RMSE n ( i) ( i) (11)

i=1 (m)

Zˆ x Z x

where ( i) is the estimated SOM content, ( i) is the measured

SOM content, and n is the number of validation sites. 0.265 0.805 9.67

The relative improvement (RI) of ANN-kriging over the other Elevation 393 81)

5095 4066 3555 3962 =

interpolation methods for the validation sites was used as another n (

index for accuracy evaluation, and was defined as (Mishra et al.,

) 1 sites

2010) − kg

− RMSE RMSE (g

int ANN-OK

RI = × 100%, (12)

RMSE 0.831 0.765 int 2.40 20.47 10.74

Validation SOM 52.33 21.62 49.66

where RMSEANN-OK and RMSEint are the root mean square errors

of the ANN-kriging method proposed in this study and a given

interpolation method, respectively.

0.017 0.614 0.792 0.342 0.335 0.160 0.238

The Lin’s concordance correlation coefficient (CCC) was cal- NDVI 46.62

− −

culated to measure the agreement between the measured and

estimated SOM content. The CCC indicates the degree to which ◦

sites.

pairs of the measured and estimated SOM content fall on the 45

(mm)

line through the origin. It can be defined as (Lin, 1989)

2xy sampling

c = (13) 2 2 2

+

+ −

( x y) the

x y 0.758 0.570 6.4 3.2 4.2 2.6 3.7 at

Precipitation 81.99 − −

where c is the estimated CCC, is the Pearson correlation coef-

ficient between the measured and estimated SOM content, x and

are the corresponding variances of the measured and estimated variables

y C) ◦ (

SOM content, x and y are the means for the measured and esti- mated SOM content.

auxiliary

the 1.594

3. Results 63.0 17.30 12.111 Temperature 188.7 781.9 364.1 369.0 and

3.1. Descriptive statistics of the measured SOM content

content

(m)

Table 1 summarizes the main statistics of the SOM content and

the auxiliary variables at the training sites and validation sites in the (SOM) 0.473 0.695

10.77

study area. Similar to the distribution of SOM content at both train- 441 5390 4091

3275 3943

ing sites and validation sites, the SOM content of all sampling sites 163)

matter =

was positively skewed (skewness = 0.828) and showed a leptokur- n (

) Elevation

tic distribution (kurtosis = 0.434), which is consistent with previous 1 − organic

sites

studies (Marchetti et al., 2012; Pei et al., 2010; Zhang et al., 2012). kg

(g

soil This distribution occurred because some sampling sites in the east-

of

0.845 0.322 ern region of the study area had large SOM content values. To ensure 4.18 20.78 10.49

Training SOM 53.66 22.90 45.83

the reliability of the spatial interpolation, these observations were

included in the study. The SOM content of all sampling sites ranged statistics

−1 −1

from 2.40 to 53.66 g kg , with a mean of 22.47 g kg and standard

1

deviation of 10.57 g kg . This variation may be attributed to the 1

(%)

extent of the study area with large differences in elevation and cli- Statistic Min. Max. Mean Median SD CV Kurtosis Skewness Table

Descriptive mate. Because of the skewness of the distribution, the median value

F. Dai et al. / Ecological Indicators 45 (2014) 184–194 189

Fig. 3. Root mean square error (RMSE) and mean absolute error (MAE) of a neural

network using radial basis function under different neuron number in hidden layer.

−1

(20.60 g kg ) was more representative of the average SOM content

in the study area than the arithmetic mean (Mishra et al., 2010).

3.2. Model performance of RBF network

As mentioned earlier, a RBF network was trained with the input

of elevation, temperature, precipitation, and NDVI and measured

SOM content from training sites. In particular, it is worth noting

that the optimal number of hidden neurons was determined based

on the accuracy of SOM content prediction for validation sites. The

performance of the ANN model trained with RBF as activation func-

tions was evaluated while changing the number of neurons in the

Fig. 4. Semivariograms (points) and fitted models (lines) of (a) residuals from arti-

hidden layer from 1 to 30.

ficial neural network for soil organic matter (SOM) content estimation, and (b) the

The accuracy of the RBF network for various numbers of neu-

log-transformed SOM content from the training sites.

rons in the hidden layer is shown in Fig. 3. The results revealed

that the optimized RBF network architecture was 4-6-1, indicat-

For the training sites, residuals were obtained using Eq. (5)

ing that there were four input nodes in the input layer, six nodes

based on the results of ANN prediction. The probability distribu-

in the hidden layer, and one node in the output layer. When there

tion of the residuals at training sites showed a good fit with a

were fewer than six hidden layer neurons, the net scale was too

normal distribution (skewness = 0.429, kurtosis = 0.265). The result

small, and the prediction accuracy kept decreasing with increasing

of performing a Kolmogorov–Smirnov (K–S) test indicates that this

RMSE and MAE. On the other hand, the model could be over-fitted

variable has a normal distribution (p > 0.05). Therefore, the resid-

when the number of hidden layer neurons was greater than six.

uals were directly used to calculate experimental semivariograms

In other words, over-fitted RBF network models would generate a

for ordinary kriging interpolation. The parameters of experimen-

model with poor generalization and could not yield accurate pre-

tal semivariograms were estimated by the weighted least-squares

diction with input data other than the original training set (Chan

method (Webster and Oliver, 2001). In such a case, the isotropic

et al., 2006; Zhao et al., 2009). The training accuracy of the model

exponential model was found to be suitable to fit the experimental

was high, but the RMSE and MAE of the estimated SOM content for

semivariograms in this study.

the validation sites kept increasing with decreasing accuracy. It is

The parameters of the best-fit model are shown in Table 2. Fig. 4a

concluded that the performance of a RBF network could be signifi-

shows the experimental semivariogram with the fitted model,

cantly affected by the network architecture. In this study, the best

determining the spatial dependence of the residuals in the study

RBF network for predicting SOM content was the model with six

area. The nugget/sill ratio was used to define distinct classes of

hidden layer neurons.

spatial dependence for the SOM content as follows (Cambardella

As shown in Fig. 3, for the best RBF network model, the values

−2 −2 et al., 1994): if the ratio was ≤25%, the variable was considered

of RMSE and MAE are 9.20 g kg and 7.03 g kg , respectively, for

to be strongly spatially dependent; if the ratio was between 25%

SOM content. This prediction performance was comparable to or

and 75%, the variable was considered to be moderately spatially

better than the regression reported by Zhao et al. (2010) and Alvarez

dependent; and if the ratio was >75%, the variable was considered

et al. (2011).

to be weakly spatially dependent. In this study, the nugget/sill ratio

showed moderate spatial dependence of residuals at training sites.

3.3. Spatial prediction of SOM content Additionally, interpolation methods including universal krig-

ing and IDW were conducted to predict the SOM content in the

An RBF network model with a 4-6-1 structure net was firstly study area. The probability distribution of the SOM content at the

used to predict the SOM content map for the study area. The result training sites was positively skewed and leptokurtic (Table 1). The

was the sum of the estimated SOM content by the RBF network and result of performing the K–S test indicates that this variable has a

the residuals estimated by ordinary kriging. non-normal distribution (p > 0.05). However, the log-transformed

190 F. Dai et al. / Ecological Indicators 45 (2014) 184–194

Table 2

Parameters of the models fitted to the semivariograms for soil organic matter (SOM) content and residual interpolation at the training sites.

2

Variogram Model Range (km) Nugget Sill Nugget/sill (%) R

SOM Spherical 737 0.16 0.33 48.48 0.690

a

SOM residual Exponential 2040 50.20 107.16 46.85 0.711

a

SOM residual, residual of SOM content from artificial neural network prediction.

SOM content at training sites showed a good fit with a normal dis- value indicates that the SOM content mapping using ANN-kriging

tribution (skewness = −0.462, kurtosis = 0.526) and passed the K–S clearly shows a better agreement along the 45 degree line between

test for normality (p > 0.05). To minimize the effects of extreme the measured SOM content with the estimated values. Addition-

outliers, therefore, the log-transformed SOM content was used ally, Table 4 shows the descriptive statistics of the predicted SOM

to calculate experimental semivariograms for universal kriging. content for the validation sites obtained by different interpolation

Fig. 4b shows the semivariogram for log-transformed SOM con- methods. The mean values for the validation sites are fairly well

tent with the fitted models, which were found to be spherical in reproduced by the methods, whereas more substantial differences

this instance. The model parameters of the experimental semi- are found for the other descriptive statistics. In short, it seems that

variograms are given in Table 2. Similar to the residual kriging, most of the improvement provided by ANN-kriging compared to

the nugget/sill ratio revealed moderate spatial dependence in the the other methods is associated with a better estimation of both

semivariogram for log-transformed SOM content. the lower values and the higher values.

From the maps of predicted SOM content produced by the inter-

polation methods conducted in this study (Fig. 5), it was found

4. Discussion

that the SOM content had large spatial variation in the study area.

Differences were clearly found in the local variation of the interpo-

This study investigated the prediction ability of regional SOM

lation maps of the estimated SOM content. The maps of the SOM

content mapping. The methods considered were either based on

content produced by ANN-kriging and ANN were significantly influ-

soil observations alone (i.e., pure interpolations) or on both soil

enced by the auxiliary variables and revealed more detail than other

observations and information on auxiliary variables (i.e., ANN),

interpolation methods such as in the east of the study area.

while a hybrid method was developed by combining kriging with

ANN. Comparison of results shows that ANN-kriging provides bet-

3.4. Assessment of accuracy of interpolation methods ter prediction of SOM content with improved accuracy, relative to

other methods. This is consistent with the results of Shen et al.

Table 3 shows the accuracy assessment indices of the estimated (2004), who reported that neural network ensemble residual krig-

SOM content at validation sites for all prediction methods. The ME, ing provided good estimation accuracy for soil properties at a large

which indicates the mean bias of the predictions, changed from scale and more detail with contour maps than kriging. McBratney

−1

an underestimation ( 0.27 g kg ) using ANN-kriging to an over- et al. (2000) also reported that although different techniques pro-

−1

estimation (0.75 g kg ) using universal kriging. However, the ME duced different error in spatial prediction of soil properties, hybrid

was close to zero with ANN and IDW predictions. The MAE, which methods such as non-linear methods with offer pow-

indicates the extent to which the process leads to error, was lower erful spatial prediction methods, especially up to large scale.

−1

with ANN-kriging (4.93 g kg ) and higher with the other methods, Without considering the ANN method used in this study, inter-

−1 −1

ranging from 6.46 g kg (ANN) to 7.16 g kg (IDW). polation methods performed poorly for the SOM content mapping,

−1 2

As a joint measure of bias in the mean and in the variance, with RMSEs of about 9 g kg and R of about 0.2. In a study by

the RMSE values were greater than the MAE values for all of the Schloeder et al. (2001), kriging was expected to perform much bet-

methods. This means that a contribution of the errors exists in the ter than any of the other pure interpolation methods such as IDW,

spatial variability. Compared with the other methods, a lower RMSE especially because of the low sampling density and the high spatial

−2

(6.02 g kg ) suggest that the ANN-kriging produces less error in structure of the SOM data. One possible explanation of the poor per-

predicting the SOM content, with the relative improvement (RI) in formance of universal kriging might be the strong local variation in

RMSE ranging from 26.94% to 37.10%. The accuracy of ANN inter- the SOM content with the variability in the environmental condi-

polation was likely improved by residual estimation by kriging. In tions in the study area. The spatial variation of SOM content may be

short, ANN-kriging provides better estimates of SOM content in affected by intrinsic factors (such as the parent soil material and the

comparison with the other methods. ) and extrinsic factors (such as topography, climate, and

To further analyse the interpolation accuracy, the predicted val- land use). Generally, strongly spatially dependent properties may

ues of the SOM content were plotted against the measured values be determined by intrinsic variations in soil characteristics, while

for the validation sites (Fig. 6). A higher CCC (0.75) was obtained extrinsic variations may control the variability of these weakly

with ANN-kriging in comparison with the other methods. This CCC spatially dependent parameters (Cambardella et al., 1994). SOM

content exhibited moderate spatial dependency in the study area,

indicating that the spatial variation of the SOM content was deter-

Table 3

a −1 mined by both intrinsic factors and extrinsic factors. Furthermore,

Accuracy assessment indices of predicted soil organic matter content (SOM, g kg )

using different interpolation methods from validation sites. spatially dependent parameters had a wide range of values in this

study, reflecting the non-linear dependence of the experimental

ME MAE RMSE CCC RI (%)

variogram on the trend parameters. The ranges of residuals tend to

b

ANN-kriging −0.27 4.93 6.02 0.75

be longer than those of raw data, indicating that the trend could be

ANN 0.09 6.46 8.24 0.39 26.94

removed (Hengl et al., 2004; Takata et al., 2007). The same results

Universal kriging 0.75 6.83 9.03 0.28 33.33

b were also reported by Mishra et al. (2010) and Yang et al. (2010).

IDW 0.12 7.16 9.57 0.20 37.10

a Accordingly, a long range indicates that the soil property is influ-

ME, mean error; MAE, mean absolute error; RMSE, root mean square error; CCC,

enced by large spatial dependence factors such as vegetation over

Lin’s concordance correlation coefficient; RI, relative improvement of prediction

accuracy. greater distances than soil properties that have a shorter range and

b

ANN, artificial neural network; IDW, inverse distance weighting. the estimates of the range may reveal the distance across distinct

F. Dai et al. / Ecological Indicators 45 (2014) 184–194 191

−1

Fig. 5. Predicted soil organic matter content (SOM, g kg ) maps using (a) ANN-kriging (artificial neural network), (b) ANN, (c) universal kriging, (d) IDW (inverse distance

weighting).

soil types (Lopez-Granados et al., 2002; Webster, 1985). Therefore, variables in ANN prediction. This can be ascribed to the nonlinear

it was difficult to use universal kriging to perfectly depict the spatial modality of responses and the complex interaction between input

variation in the SOM content throughout the study area. To over- variables (Gupta et al., 2003). The complexity of the network archi-

come this problem, the incorporation of auxiliary information or tecture was adequate because the validation showed the ability

combining other methods with kriging was used to alleviate some of the model developed to generalize predictions. When the data

of these restrictions (Zhao et al., 2009). available for training are scarce, or the number of neurons in the

The SOM content was not highly correlated with environmen- hidden layer is very high, network models tend to overlearn (Rogers

tal variables in the study area. The Pearson correlation coefficients and Dowla, 1994). Despite the possibly greater simulation accuracy

between the measured SOM content and environmental variables of complex models, simpler ones show better predictions in some

were very low, that is, 0.230 (p < 0.01) for elevation, 0.165 (p < instances; thus a balance must be maintained between complex-

.

0 01) for precipitation, −0.219 (p < 0.01) for temperature, and ity and prediction accuracy (Özesmi et al., 2006). The RBF network

0.201 (p < 0.01) for NDVI. This is contrary to previous studies (Chai fitted here was a simple model, which is in accordance with theo-

et al., 2008; Zehetner and Miller, 2006), which reveal a high cor- retical expectations. Despite their better prediction ability, neural

relation between SOM content and elevation, but are consistent networks could not significantly improve the prediction accuracy

with other studies (Simbahan et al., 2006). This low correlation is from validation sites. This result confirms that the spatial variation

likely caused by the large variability of elevation across the study of SOM content is determined by both intrinsic factors and extrin-

area, as well as the derived variability in climate and vegetation. sic factors. Therefore, a better methodology should be developed to

In general, the relationships between SOM content and other envi- improve the accuracy of SOM content mapping, taking both intrin-

ronmental variables are rarely linear in nature (Zhao et al., 2010). sic and extrinsic factors into consideration. By comparison with RF,

Good performance and ability were obtained from previous stud- SVR and ANN, a combined method with the introduction of ANN

ies in terms of improving the accuracy of SOM content mapping into linear model or residual kriging helps achieve good predic-

with machine leaning methods, such as ANN (Guo et al., 2013), tions and interpretability of SOM content mapping (Ji et al., 2012;

boosted regression tree (Martin et al., 2011), Cubist (Lacoste et al., Shen et al., 2004).

2014), RF (Wiesmeier et al., 2011), and SVR (Ballabio, 2009). In The non-linearity of the correlations between SOM content and

this study, however, there was a slight improvement when eleva- environmental variables together with the small number of avail-

tion, precipitation, temperature, and NDVI were used as auxiliary able observations were two major limitations for accurate SOM

Table 4

−1

Descriptive statistics of the predicted soil organic matter content (SOM, g kg ) from the validation sites by different interpolation methods.

Min. Max. Mean SD CV (%)

a

ANN-kriging 2.42 53.48 21.35 8.63 40.42

ANN 4.23 50.82 21.71 6.26 28.83

Universal kriging 13.11 40.10 22.36 5.67 25.36

a

IDW 11.75 38.19 21.74 5.20 23.94

a

ANN, artificial neural network; IDW, inverse distance weighting.

192 F. Dai et al. / Ecological Indicators 45 (2014) 184–194

−1

Fig. 6. Scatterplots of the measured and estimated soil organic matter content (SOM, g kg ) from validation sites using (a) ANN-kriging (artificial neural network), (b) ANN,

(c) universal kriging, (d) IDW (inverse distance weighting).

content mapping in the study area. To deal with these limitations, a ANN-kriging outperformed other methods in the ability to increase

two-stage process methodology for spatial prediction and residual prediction accuracy at large scales.

estimation was used in this study. In the first stage, ANN was used

to remove the correlation between the SOM content and environ- 5. Conclusions

mental variables and to retain the control of the spatial variation

of the SOM content by intrinsic factors. Then, ordinary kriging was In this study, DSM was conducted for SOM content in the Tibetan

used to estimate the residuals of ANN. As a result, the methodology Plateau where less information was known about its spatial dis-

significantly improved the accuracy of SOM content mapping. In tribution. A two-stage process was proposed that incorporated a

particular, the defects of other interpolation methods, concerning RBF network with residual estimation by ordinary kriging to pre-

the overestimation in the lower values and the underestimation in dict SOM content at a regional scale. A comparison of ANN-kriging

the higher values, were corrected to a certain extent with account- with the other interpolation methods was performed to assess the

ing for the effects of both intrinsic and extrinsic factors. This finding prediction accuracy. The results indicated that the optimized RBF

confirms that the RBF network can be used to interpret the non- network architecture to predict SOM content was one with four

linear relationships between soil characteristics and environmental input nodes, six hidden nodes, and one output node. The isotropic

variables (Zhao et al., 2009) and ordinary kriging can be used to exponential model was found to be suitable to fit the experimen-

depict the spatial variation of the SOM content that is controlled tal semivariograms which shows moderate spatial dependence of

by intrinsic factors (Cambardella et al., 1994). This can also be seen the residuals from ANN prediction. The maps of the SOM content

−2

from the low RMSE (6.02 g kg ) and the high CCC (0.75) in Table 3. produced by ANN-kriging and ANN revealed large spatial variation

These values are comparable to the accuracies obtained by Chai and more detail than universal kriging and IDW in the east of the

et al. (2008) and Stevens et al. (2010), who spatially predicted soil study area.

organic carbon content at regional scale using regression kriging The accuracy of spatial prediction was raised by integrating

and airborne imaging spectroscopy respectively. Compared to val- ANN and ordinary kriging, with lower RMSE and higher Lin’s

ues reported by Liu et al. (2008) and Pei et al. (2010), RMSE was concordance correlation coefficient for validation sampling sites

low and CCC was high in this study. In summary, comparison of the compared with other methods. Relative improvements in RMSE

ANN-kriging method with pure interpolation methods shows that ranging from 26.94% to 37.10% over other methods were observed

F. Dai et al. / Ecological Indicators 45 (2014) 184–194 193

in the prediction of SOM content. In this context, the ordinary krig- Lin, G.F., Chen, L.H., 2004. A spatial interpolation method based on radial basis func-

tion networks incorporating a semivariogram model. J. Hydrol. 288, 288–298.

ing reduced the prediction errors for SOM content mapping by

Lin, L.I., 1989. A concordance correlation coefficient to evaluate reproducibility. Bio-

estimating the residuals of ANN prediction. The results confirm that

metrics 45, 255–268.

ANN-kriging can be used as a method to improve the accuracy of Liu, G.S., 1996. Soil Physical and Chemical Analysis & Description of Soil Profiles.

Chinese Standard Press, Beijing (In Chinese).

spatial prediction for SOM content mapping and help to advance

Liu, X.M., Zhao, K.L., Xu, J.M., Zhan, M.H., Si, B., Wang, F., 2008. Spatial variability of

other digital soil mapping efforts at large, regional scales in such

soil organic matter and nutrients in paddy fields at various scales in southeast

global soil map project. China. Environ. Geol. 53, 1139–1147.

Lopez-Granados, F., Jurado-Exposito, M., Atenciano, S., Garcia-Ferrer, A., de la Orden,

M.S., Garcia-Torres, L., 2002. Spatial variability of agricultural soil parameters in

Acknowledgments southern Spain. Plant Soil 246, 97–105.

Malone, B.P., McBratney, A.B., Minasny, B., Laslett, G.M., 2009. Mapping continuous

depth functions of soil carbon storage and available water capacity. Geoderma

The authors are grateful for financial support from the follow-

154, 138–152.

ing projects: the National Natural Science Foundation Committee

Manlay, R., Feller, C., Swift, M., 2007. Historical evolution of soil organic matter con-

(Grant no. 41301351, 41101503, 41101155), Scientific Research cepts and their relationships with the fertility and sustainability of cropping

systems. Agric. Ecosyst. Environ. 119, 217–233.

Foundation of Chongqing Technology and Business University

Marchetti, A., Piccini, C., Francaviglia, R., Mabit, L., 2012. Spatial distribution of soil

(2013-56-05) and “strategic priority research program – climate

organic matter using geostatistics: a key indicator to assess soil degradation

change: “carbon budget and related issues” of the Chinese Academy status in central Italy. 22, 230–242.

Martin, M.P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L.,

of Sciences (Grant no. XDA05050506).

Arrouays, D., 2011. Spatial distribution of soil organic carbon stocks in France.

Biogeosciences 8, 1053–1065.

References McBratney, A.B., Odeh, I.O.A., Bishop, T.F.A., Dunbar, M.S., Shatar, T.M., 2000.

An overview of pedometric techniques for use in soil survey. Geoderma 97,

293–327.

Alsamamra, H., Ruiz-Arias, J.A., Pozo-Vazquez, D., Tovar-Pescador, J., 2009. A com-

McBratney, A.B., Santos, M.L.M., Minasny, B., 2003. On digital soil mapping. Geo-

parative study of ordinary and residual kriging techniques for mapping global

derma 117, 3–52.

solar radiation over southern Spain. Agric. For. Meteorol. 149, 1343–1357.

Mebius, L.J., 1960. A rapid method for the determination of organic carbon in soil.

Alvarez, R., Steinbach, H.S., Bono, A., 2011. An artificial neural network approach for

Anal. Chim. Acta 22, 120–124.

predicting soil carbon budget in agroecosystems. Soil Sci. Soc. Am. J. 75, 965–975.

Milne, E., Adamat, R.A., Batjes, N.H., Bernoux, M., Bhattacharyya, T., Cerri, C.C., Cerri,

Ballabio, C., 2009. Spatial prediction of soil properties in temperate mountain regions

C.E.P., Coleman, K., Easter, M., Falloon, P., 2007. National and sub-national assess-

using support vector regression. Geoderma 151, 338–350.

ments of soil organic carbon stocks and changes: the GEFSOC modelling system.

Batjes, N., Sombroek, W., 1997. Possibilities for carbon sequestration in tropical and

Agric. Ecosyst. Environ. 122, 3–12.

subtropical soils. Global Change Biol. 3, 161–173.

Mishra, U., Lal, R., Liu, D.S., Van Meirvenne, M., 2010. Predicting the spatial varia-

Benjamin, J.G., Mikha, M.A., Vigil, M.R., 2008. Organic carbon effects on soil physical

tion of the soil organic carbon pool at a regional scale. Soil Sci. Soc. Am. J. 74,

and hydraulic properties in a semiarid climate. Soil Sci. Soc. Am. J. 72, 1357–1362.

906–914.

Bi, S., 1997. Best laboratory of the universal research for the Earth’s global change

Mueller, T.G., Pierce, F.J., 2003. Soil carbon maps: enhancing spatial estimates

and earth system science – the Qinghai-Tibet plateau. Trans. Chin. Soc. Agric.

with simple terrain attributes at multiple scales. Soil Sci. Soc. Am. J. 67,

Eng. 17, 72–77 (In Chinese).

258–267.

Broomhead, D.S., Lowe, D., 1988. Multivariable functional interpolation and adaptive

Odeh, I.O.A., McBratney, A.B., Chittleborough, D.J., 1995. Further results on prediction

networks. Complex Syst. 2, 321–355.

of soil properties from terrain attributes: heterotopic cokriging and regression-

Cambardella, C., Moorman, T., Novak, J., Parkin, T., Karlen, D., Turco, R., Konopka, A.,

kriging. Geoderma 67, 215–226.

1994. Field-scale variability of soil properties in central Iowa soils. Soil Sci. Soc.

Özesmi, S.L., Tan, C.O., Özesmi, U., 2006. Methodological issues in building, training,

Am. J. 58, 1501–1510.

and testing artificial neural networks in ecological applications. Ecol. Mod. 195,

CAS (Chinese Academy of Sciences), Scientific Working Group on the Qinghai-Tibet

83–93.

Plateau, 1982. Physical Geography of Tibet. Science Press, Beijing (In Chinese).

Pei, T., Qin, C.Z., Zhu, A.X., Yang, L., Luo, M., Li, B., Zhou, C., 2010. Mapping soil organic

Chai, X.R., Shen, C.Y., Yuan, X.Y., Huang, Y.F., 2008. Spatial prediction of soil organic

matter using the topographic wetness index: a comparative study based on dif-

matter in the presence of different external trends with REML-EBLUP. Geoderma

ferent flow-direction algorithms and kriging methods. Ecol. Indic. 10, 610–619.

148, 159–166.

Piccini, C., Marchetti, A., Francaviglia, R., 2014. Estimation of soil organic matter by

Chan, Z.S.H., Ngan, H.W., Rad, A.B., David, A.K., Kasabov, N., 2006. Short-term ANN

geostatistical methods: use of auxiliary information in agricultural and environ-

load forecasting from limited data using generalization learning strategies. Neu-

mental assessment. Ecol. Indic. 36, 301–314.

rocomputing 70, 409–419.

Read, J.W., Ridgell, R.H., 1922. On the use of the conventional carbon factor in esti-

Dai, F.Q., Su, Z.A., Liu, S.Z., Liu, G.C., 2011. Temporal variation of soil organic matter

mating soil organic matter. Soil Sci. 13, 1–6.

content and potential determinants in Tibet, China. Catena 85, 288–294.

Rogers, L.L., Dowla, F.U., 1994. Optimization of remediation using arti-

Frogbrook, Z.L., Oliver, M.A., 2001. Comparing the spatial predictions of soil organic

ficial neural networks with parallel solute transport modeling. Water Resour.

matter determined by two laboratory methods. Soil Use Manage. 17, 235–244.

Res. 30, 457–481.

Gautam, R., Panigrahi, S., Franzen, D., Sims, A., 2011. Residual soil nitrate prediction

Sanchez, P.A., Ahamed, S., Carré, F., Hartemink, A.E., Hempel, J., Huising, J., Lagacherie,

from imagery and non-imagery information using neural network technique.

P., McBratney, A.B., McKenzie, N.J., Mendonc¸ a-Santos, M.d.L., Minasny, B., Mon-

Biosyst. Eng. 110, 20–28.

tanarella, L., Okoth, P., Palm, C.A., Sachs, J.D., Shepherd, K.D., Vågen, T.G.,

Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. Oxford Univer-

Vanlauwe, B., Walsh, M.G., Winowiecki, L.A., Zhang, G.L., 2009. Digital soil map

sity Press, Oxford.

of the world. Science 325, 680–681.

Guo, P.T., Wu, W., Sheng, Q.K., Li, M.F., Liu, H.B., Wang, Z.Y., 2013. Prediction of soil

Schloeder, C.A., Zimmerman, N.E., Jacobs, M.J., 2001. Comparison of methods for

organic matter using artificial neural network and topographic indicators in hilly

interpolating soil properties using limited data. Soil Sci. Soc. Am. J. 65, 470–479.

areas. Nutr. Cycl. Agroecosyst. 95, 333–344.

Schmitz, G.H., Puhlmann, H., Droge, W., Lennartz, F., 2005. Artificial neural networks

Gupta, M.M., Jin, L., Homma, N., 2003. Static and Dynamic Neural Networks: From

for estimating soil hydraulic parameters from dynamic flow experiments. Eur.

Fundamentals to Advanced Theory. Wiley Interscience, New York.

J. Soil Sci. 56, 19–30.

Haykin, S., 1994. Neural Networks: A Comprehensive Foundation. IEEE Press, New

York. Scull, P., Franklin, J., Chadwick, O.A., McArthur, D., 2003. Predictive soil mapping: a

review. Prog. Phys. Geogr. 27, 171–197.

Hengl, T., Heuvelink, G.B.M., Stein, A., 2004. A generic framework for spatial predic-

Shen, Z.Q., Shi, J.B., Wang, K., Kong, F.S., Bailey, J.S., 2004. Neural network ensemble

tion of soil variables based on regression kriging. Geoderma 120, 75–93.

residual kriging application for spatial variability of soil properties. Pedosphere

Isaaks, E.H., Shrivastava, R.M., 1989. An Introduction to Applied Geostatistics. Oxford

14, 289–296.

University Press, Oxford.

Simbahan, G.C., Dobermann, A., Goovaerts, P., Ping, J., Haddix, M.L., 2006. Fine-

Ji, W.J., Li, X., Li, C.X., Zhou, Y., Shi, Z., 2012. Using different data mining algorithms

resolution mapping of soil organic carbon based on multivariate secondary data.

to predict soil organic matter based on visible-near infrared spectroscopy. Spec-

Geoderma 132, 471–489.

trosc. Spectr. Anal. 32, 2393–2398.

Smith, P., 2004. How long before a change in soil organic carbon can be detected?

Karunaratne, S.B., Bishop, T.F.A., Baldock, J.A., Odeh, I.O.A., 2014. Catchment scale

Global Change Biol. 10, 1878–1883.

mapping of measurable soil organic carbon fractions. Geoderma 219–220,

14–23. Stevens, A., Udelhoven, T., Denis, A., Tychon, B., Lioy, R., Hoffmann, L., van Wese-

mael, B., 2010. Measuring soil organic carbon in croplands at regional scale using

Lacoste, M., Minasny, B., McBratney, A., Michot, D., Viaud, V., Walter, C., 2014. High

airborne imaging spectroscopy. Geoderma 158, 32–45.

resolution 3D mapping of soil organic carbon in a heterogeneous agricultural

Subburayalu, S.K., Slater, B.K., 2013. Soil series mapping by knowledge discovery

landscape. Geoderma 213, 296–311.

from an Ohio County soil map. Soil Sci. Soc. Am. J. 77, 1254–1268.

Li, Y.S., Zhang, R.H., Wang, G.X., Zhao, L., Ding, Y.J., Wang, Y.B., 2009. Spatial vari-

Sun, Y., Kang, S.Z., Li, F.S., Zhang, L., 2009. Comparison of interpolation methods for

ability characteristics of soil organic carbon and nitrogen reveal typical alpine

depth to groundwater and its temporal and spatial variations in the Minqin oasis

meadow degradation on Qinghai-Tibetan Plateau. Environ. Sci. 30, 1826–1831

of northwest China. Environ. Model. Softw. 24, 1163–1170. (In Chinese).

194 F. Dai et al. / Ecological Indicators 45 (2014) 184–194

Takata, Y., Funakawa, S., Akshalov, K., Ishida, N., Kosaki, T., 2007. Spatial prediction of Yang, Y., Fang, J., Ji, C., Ma, W., Su, S., Tang, Z., 2010. Soil inorganic carbon stock in

soil organic matter in northern Kazakhstan based on topographic and vegetation the Tibetan alpine grasslands. Global Biogeochem. Cy. 24, GB4022.

information. Soil Sci. Plant Nutr. 53, 289–299. Zehetner, F., Miller, W.P., 2006. Soil variations along a climatic gradient in an Andean

Wang, S.Q., Tian, H.Q., Liu, J.Y., Pan, S.F., 2003. Pattern and change of soil organic agro-ecosystem. Geoderma 137, 126–134.

carbon storage in China: 1960–1980s. Tellus B 55, 416–427. Zhang, Z.X., Kushwaha, R.L., 1999. Applications of neural networks to simulate soil-

Webster, R., 1985. Quantitative spatial analysis of soil in the field. Adv. Soil Sci. 3, tool interaction and soil behavior. Can. Agric. Eng. 41, 119–125.

1–70. Zhang, S., Huang, Y., Shen, C., Ye, H., Du, Y., 2012. Spatial prediction of soil organic

Webster, R., Oliver, M., 2001. Geostatistics for Environmental Scientists. John Wiley matter using terrain indices and categorical variables as auxiliary information.

& Sons, Chichester, pp. UL–UL9. Geoderma 171–172, 35–43.

Wiesmeier, M., Barthold, F., Blank, B., Kogel-Knabner, I., 2011. Digital mapping of Zhao, M.S., Rossiter, D.G., Li, D.C., Zhao, Y.G., Liu, F., Zhang, G.L., 2014. Mapping soil

soil organic matter stocks using Random Forest modeling in a semi-arid steppe organic matter in low-relief areas based on land surface diurnal temperature

ecosystem. Plant Soil 340, 7–24. difference and a vegetation index. Ecol. Indic. 39, 120–133.

Worsham, L., Markewitz, D., Nibbelink, N., 2010. Incorporating spatial dependence Zhao, Z.Y., Chow, T.L., Rees, H.W., Yang, Q., Xing, Z.S., Meng, F.R., 2009. Predict soil

into estimates of soil carbon contents under different land covers. Soil Sci. Soc. texture distributions using an artificial neural network model. Comput. Electron.

Am. J. 74, 635–646. Agric. 65, 36–48.

Wu, C.F., Wu, J.P., Luo, Y.M., Zhang, L.M., DeGloria, S.D., 2009. Spatial prediction of Zhao, Z.Y., Yang, Q., Benoy, G., Chow, T.L., Xing, Z.S., Rees, H.W., Meng, F.R., 2010.

soil organic matter content using cokriging with remotely sensed data. Soil Sci. Using artificial neural network models to produce soil organic carbon content

Soc. Am. J. 73, 1202–1208. distribution maps across landscapes. Can. J. Soil Sci. 90, 75–87.