Accounting for Taxonomic Distance in Accuracy Assessment of Soil Class Predictions

Geoderma 292 (2017) 118–127
Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/geoderma

David G. Rossiter (a, c, d, *), Rong Zeng (a, b), Gan-Lin Zhang (a, b)

a State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, PR China
b University of the Chinese Academy of Sciences, Beijing 100049, PR China
c ISRIC World Soil Information, PO Box 353, Wageningen 6700 AJ, The Netherlands
d Section of Soil & Crop Sciences, Cornell University, 242 Emerson Hall, Ithaca, NY 14850, USA
* Corresponding author.

http://dx.doi.org/10.1016/j.geoderma.2017.01.012
ISSN 0016-7061

ARTICLE INFO

Article history: Received 3 September 2016. Received in revised form 2 January 2017. Accepted 9 January 2017. Available online xxxx.

Keywords: Soil class maps; Accuracy assessment; Map evaluation

ABSTRACT

Evaluating the accuracy of allocation to classes in monothetic hierarchical soil classification systems, including the World Reference Base for Soil Classification, US Soil Taxonomy, and Chinese Soil Taxonomy, is poorly-served by binomial methods (correct/incorrect allocation per evaluation observation), since some errors are more serious than others in terms of soil properties, map use, pedogenesis, and ease of mapping. Instead, evaluations should account for the taxonomic distance between classes, expressed as class similarities, giving partial credit to some incorrect allocations. These can then be used in weighted accuracy measures, either direct measures of agreement or measures that account for chance agreement, such as the tau index. Similarities can be determined in one of four ways: (1) by the expert opinion of a soil classification specialist; (2) by the distance between classes in a numerical taxonomy assessment; (3) by distance within a taxonomic hierarchy; or (4) by an error loss function. Expert opinion can be from the point of view of the map user, to assess map utility, or map producer, to assess mapping skill. Examples are given of determining similarity between a subset of Chinese Soil Taxonomy classes by expert opinion and by numerical taxonomy from soil spectra, and then using these for weighted accuracy assessment. A method for assessing the accuracy of probabilistic predictions of several classes at a location is also proposed.

© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Soil classes are information carriers that present a holistic view of groups of soil individuals with a definite “personality”. They have proven useful as the units of soil survey interpretation and to explain soil geography, for example using the World Reference Base for Soil Classification (WRB) (Driessen et al., 2001; FAO, 2014; Van Wambeke and Nachtergaele, 2003), US Soil Taxonomy (ST) (Buol et al., 2011; Soil Survey Staff, 2014), or Chinese Soil Taxonomy (CST) (Cooperative Research Group on Chinese Soil Taxonomy, 2001; Gerasimova, 2010). These are all monothetic hierarchical classification systems, where individuals are allocated to single classes according to sharp thresholds, so that all members of any class share a certain set of features that are not present in any members of other classes (Sneath and Sokal, 1973). This allows the construction of hierarchical keys by which a soil individual is allocated to exactly one class.

Any predictive method, for example digital (McBratney et al., 2003) or predictive (Scull et al., 2003) soil mapping (DSM, PSM) that allocates individuals to single monothetic classes is sure to produce incorrect allocations with respect to the true state of nature. The producer of such a prediction is obliged to evaluate and report its accuracy, so that the map user can determine whether the map is fit for an intended use, and so that the skill of the producer may be assessed. There is an extensive literature on measures of classification accuracy (e.g., Stehman and Czaplewski, 1998). These all depend on a set of evaluation observations (often called validation observations, but see Oreskes et al. (1994) for the use of ‘evaluation’ in place of ‘validation’): the true state of nature is compared to the prediction. In the case of allocation to soil classes, the “true” state is presumed to be the allocation assigned by the professional classifier. These observations are usually presented as a cross-classification matrix (also called a confusion matrix): rows represent classes as predicted, columns the reference class as presumed correct, and cell entries are the number of evaluation observations for each combination.

Most evaluations of classification accuracy consider all misallocations, represented by the off-diagonal entries of the cross-classification matrix, to be equally serious errors. However, when the predictions are of soil classes, it is obvious that not all errors are equally serious. In the first place, professional classifiers often disagree on the correct allocation in doubtful, and sometimes not so doubtful, cases, as anyone who has been on a soil correlation exercise can attest. Certain diagnostic features are subjective, despite the efforts of the creators of the keys to make them as objective as possible. For example, the Soil Taxonomy definition of a fragipan requires, in part that “the layer shows evidence of pedogenesis within the horizon, or at a minimum, on the faces of structural units” (Soil Survey Staff, 2014, p. 13). This clearly relies on the classifier’s opinion of what constitutes “evidence of pedogenesis”.

Second, the selection of the soil individual, typically a soil profile, for evaluation is subjective, and moving a few meters one way or the other can easily change the classification. For example, in the WRB a mollic horizon, not directly over certain indurated materials, must have a thickness of 20 cm or more. It is easy to find natural soil bodies shown as delineations on an area-class map where the thickness of a horizon that would otherwise qualify as mollic ranges from 18 to 22 cm. Thus the evaluation observation with a calcic horizon and a dark, organic-rich, high-base saturation epipedon would be classified as a Kastanozem if the mollic horizon is recognized, but a Calcisol otherwise. Clearly any misallocation by the predictive method can not be expected to do better than the field scientist selecting a supposedly representative individual. An extreme case was reported by Edmonds and Lentner (1986), who found profiles allocated to four Soil Taxonomy soil orders within 7 m of each other in a steep hillside map unit in the Ridge and Valley province in Virginia (USA), all with similar use potential and pedogenesis.

Third, and most importantly, the consequences of a misallocation for someone using the prediction can differ widely depending on the relation between the allocated class and the actual class. For example, in the USA if a map delineation is allocated to a soil series which is considered a so-called hydric soil, a large number of restrictions are placed on the land use (Vepraskas, 2015), so predicting a hydric soil where the soil is in fact not hydric imposes a large burden on the land user. The reverse error, i.e., predicting a non-hydric soil where the soil is in fact hydric, can lead to severe environmental problems such as agricultural chemicals in surface waters.

[...]

[...] use taxonomic distance to evaluate mapping accuracy. The work of Dobos et al. (2014) was by their own conclusion only a preliminary proof of concept; the work of van Beek (2014) is explained in §3.2, below.

Therefore the objectives of this paper are to: (1) explain how the conventional classification accuracy assessment method using the cross-classification matrix can be adjusted to account for taxonomic distance; (2) present and discuss different methods for computing taxonomic distance to be used in accuracy assessment. We do not explore the very important issue of how the observations used to build the cross-classification matrix were obtained, i.e., the sampling plan for map evaluation (e.g., Brus et al., 2011; Stehman and Czaplewski, 1998). Our objective here is simply to see how adjustments to the conventional accuracy measures can be made, taking into account taxonomic distance between classes.

2. The ordinary and weighted cross-classification matrices

The cross-classification matrix $X$ is the fundamental data structure in accuracy assessment (Congalton and Green, 1999). It is constructed as follows. Suppose that we have $n$ soil profiles and that they have been allocated to $r$ classes, which we denote $i$, $i = 1, 2, \ldots, r$. We set up a square asymmetric $r \times r$ matrix $X$, each row and each column of which corresponds to one class, in the same order. In each cell $(i, j)$ we enter the number of profiles which actually are of class $j$ that have been predicted to belong to class $i$. Thus the diagonals $(1, 1), (2, 2), \ldots, (r, r)$ represent agreement between predicted and actual, and off-diagonals represent different misallocations. The asymmetry arises because there is no reason to expect that a misallocation of a profile actually of class $j$ to class $i$ will happen as frequently as the inverse misallocation. From this matrix we compute row sums $x_{i+}$, i.e., the total number allocated to class $i$, and the column sums $x_{+j}$, i.e., the total actually in class $j$. The row-wise proportion of correct allocations $C_i = x_{ii}/x_{i+}$ is commonly referred to as the user’s accuracy for class $i$; in the case of map accuracy assess- [...]
