Geoderma 368 (2020) 114287


Geoderma

journal homepage: www.elsevier.com/locate/geoderma

Machine learning digital soil mapping to inform gully erosion mitigation measures in the Eastern Cape, South Africa

Casper du Plessis a, George van Zijl b,⁎, Johan Van Tol c, Alen Manyevere a

a Department of Agronomy, University of Fort Hare, Alice 5700, South Africa
b Unit for Environmental Sciences and Management, North-West University, Potchefstroom, South Africa
c University of the Free State, Nelson Mandela Avenue, 9301 Bloemfontein, South Africa

ARTICLE INFO

Handling Editor: Jan Willem Van Groenigen

Keywords: Cubist; Gully erosion; Multinomial logistic regression; Planosols; Sediment

ABSTRACT

Soil erosion is probably the most common form of land degradation, leading to on- and off-site detrimental effects, such as an increase in river sediment load, which has been shown to drastically decrease dam storage capacity. The South African Government has announced plans to build two large dams within an area with a very high erosion rate, thus necessitating soil erosion mitigation efforts. These efforts should consider the mechanism of gully formation, as it determines which conservation measures could be used. With overland flow gullies the kinetic energy of flowing water needs to be curbed, while gullies form by piping in dispersive soils where free water could accumulate. This paper describes how a soil association map for the quaternary catchments T35DE was created with digital soil mapping methods, in order to indicate areas where piping could occur. Soils were described and classified at 600 locations determined with the conditioned Latin hypercube sampling method, within a 500 m buffer around the available road network. This soil database was then used with the multinomial logistic regression algorithm to create an initial soil association map with seven soil associations, based on their soil erosion properties. Additionally, a soil depth map was created using the Cubist algorithm. The soil association map was then created again, but with the soil depth map added as an additional covariate. Including the soil depth map as a covariate improved the accuracy (68% vs 64%) and level of detail of the final soil association map, even though the soil depth map was a poor reflection of reality. The final soil association map was compared to the best currently available spatial soil dataset, and found to contain 123 times more detail. Superimposing the gully extent on the soil association map confirmed that the planosols are the most susceptible to gully erosion. Mitigation efforts should avoid increasing free water in the soil on the planosol area, and the as yet ungullied planosol area should be prioritized, as it could become a hotspot for gully formation if it degrades.

⁎ Corresponding author. E-mail address: [email protected] (G. van Zijl).
https://doi.org/10.1016/j.geoderma.2020.114287
Received 7 August 2019; Received in revised form 18 February 2020; Accepted 21 February 2020; Available online 28 February 2020
0016-7061/ © 2020 Elsevier B.V. All rights reserved.

1. Introduction

Soil erosion is likely the most common form of land degradation worldwide (Bridges and Oldeman, 1999). In South Africa, large areas (85% of land) are vulnerable to soil erosion (Van Rensburg, 2008), with more than 70% of the country already affected by varying intensities of soil erosion (Garland et al., 2000). The estimated soil erosion rates average more than 3 t/ha/year (Scotney and Mcphee, 1990). The Eastern Cape Province, particularly in the communal lands, is severely affected by soil erosion (Rowntree et al., 2008), where average annual soil loss rates exceed 12 t/ha/year (Le Roux et al., 2007). Soil erosion has economic, social and environmental implications due to both on-site and off-site effects. It not only involves the loss of fertile topsoil and reduction of soil productivity but is also largely connected with problematic impacts on another location, such as increased load of sediments and distribution thereof to rivers and dams (Van Tol et al., 2014). A dramatic example of the detrimental impact of sediments is the Mount Fletcher Bulk Water Supply Scheme in the upper Mzimvubu River, which lost 70% of its storage capacity due to sedimentation within 4 years after completion of the project (Zunckel, 2013).

The South African Government recently announced plans to launch the Mzimvubu Water Project (MWP) in the Eastern Cape Province, in an area similar to the Mount Fletcher Bulk Water Supply Scheme. The MWP will involve the building of two large storage dams (Ntabelanga and Laleni) in the Tsitsa River, one of the largest tributaries to the Mzimvubu River. The Mzimvubu River is the largest undeveloped river in South Africa, despite its high annual runoff, high environmental status (NFEPA Report, 2011), high tourism potential, and its suitability for afforestation and moderate suitability for dryland/rainfed and irrigated agriculture (NWRS2, 2012). The MWP is supposed to rejuvenate agriculture and serve as catalyst for economic and social development (DWS, 2014). The vast sediment loads in the Tsitsa River pose severe questions regarding the sustainability of the planned MWP, and adequate rehabilitation of eroded land prior to the project cannot be over-emphasised (Le Roux, 2018).

To curb soil erosion and sediment yield, the processes by which it occurs must be known, in order to manage it. Within the study site, it was determined that the sediment yield derived from gullies is between 300 and 23 700 Mg/km²/y, while that derived from rill and interrill erosion lies between 5 and 100 Mg/km²/y, for the time period 2007–2012 (Le Roux, 2018). Therefore, conservation efforts should focus on stabilizing existing gullies and preventing the formation of new gullies. Additionally, it is thought that many of the current gullies in the area were created through pipe collapse (DWS, 2014). Piping occurs where a dispersive soil is present, free water accumulates in the subsoil and there is an outlet for the free water (Fletcher and Carroll, 1948; Thornes, 1980). Consequently, in areas with dispersive soils, soil erosion mitigation measures should prevent excess water from infiltrating the soil, as this could lead to dispersion and therefore piping. This is in contrast to mitigation measures for overland flow gullies, as in these areas the kinetic energy of the overland flow should be minimized, often with structures such as contours, which commonly increase water infiltration. The importance of this distinction is exemplified by Faber and Imeson (1982), who classified gullies in Lesotho, a geomorphological area similar to the study site, according to their mechanism of formation. Also in Lesotho, van Zijl et al. (2013a) showed that the soil spatial distribution was the controlling factor of gully erosion, with specifically the planosols being susceptible to gully erosion. Planosols (often referred to as duplex soils in southern Africa) are characterized by an abrupt textural change from a coarse textured topsoil to a finely textured subsoil (Rooyani, 1985), and are often prone to piping (Chakela, 1981). The catchment area of the Tsitsa River contains large areas with planosols (Land Type Survey Staff, 1972–2006), and approximately 71% of the soils surrounding the proposed Ntabelanga dam are highly sensitive to erosion (Van Tol et al., 2014). However, the spatial distribution of the soils in the catchment is unknown at a scale larger than 1:250 000, which is too coarse to inform management decisions at field scale.

The best available soil spatial dataset for the entire area is the land type survey (Land Type Survey Staff, 1972–2006), a South African resource similar to the SOTER database. The land types are described by van Zijl et al. (2013b) as a “1:250 000 scale map, showing relative homogeneous parcels of land in terms of terrain, climate and soil distribution pattern. Each such parcel of land is a different land type. Soil distribution is given as a rough estimate of the percentage of area of each terrain morphological unit (TMU; also called terrain unit) covered by a specific soil series.” Therefore, there is a level of soil spatial distribution embedded in each land type polygon via the percentage estimation of the different terrain units. Individual land types will typically have a name given as a code with two letters (for example Db), followed by a number. Land types are also grouped together into broad land types (coded only using the letters), which is the grouping of land types where similar soil series occupy similar percentages of area, irrespective of terrain position distribution.

Recently digital soil mapping (DSM) approaches have been increasingly used to provide cost effective and timeous improved soil spatial distribution information for specific functions in southern Africa, including forestry production (Van Zijl et al., 2014a), hydrology (Van Zijl and Le Roux, 2014; Van Zijl et al., 2019) and soil erosion monitoring (Van Tol et al., 2014). A summary of DSM projects in southern Africa is given in Van Zijl (2019). In this project, a DSM machine learning approach was used to map the soil associations of the area to enable decision makers to implement gully mechanism specific mitigation measures throughout the site. The value of the DSM created is determined by comparing it to the best currently available spatial soil information, the Land Type database, and superimposing the gullies mapped from the area over the soil map, to determine whether the soil distribution influences the gully distribution.

Fig. 1. The location of the study site catchments T35DE in the Eastern Cape, South Africa. The road network, training and evaluation observations, gullies (from Le Roux, 2018) and the proposed Ntabelanga dam footprint are also shown.

2. Site description

The study site is the quaternary catchments (Schulze, 2007) T35 D and E in the Eastern Cape Province of South Africa and is located about 626 km south of Pretoria (Fig. 1). The town of Maclear is situated to the west of the site, while the Ntabelanga dam footprint occurs within the south east of the site. The approximately 83 000 ha area has a centre point at 31° 7′ 35.9″ S and 28° 40′ 30.6″ E. The sub-escarpment Grassland and sub-escarpment Savanna (Mucina & Rutherford 2006)
are the main bioregions within the site. The site lies in the summer rainfall area, with a mean annual precipitation of 749 mm, of which the lowest mean monthly precipitation occurs in June with only 15 mm, and the highest in January (108 mm) (Schulze, 2007). The area is underlain by sedimentary rocks of the Tarkastad subgroup and Beaufort Karoo supergroup with post Karoo dolerite intrusions. There are also traces of flake conglomerates (Council for Geosciences, 2007). The area is characterized by highly unstable soils that are prone to erosion, as evidenced by extensive areas of severe gully erosion on the inter-fluvial areas in alliance with stream channels. Catchment T35E has the highest measured erosion of all catchments comprising the Tsitsa river catchment (Le Roux, 2018). The erosional and piping characteristics in this area are suggestive of the presence of dispersive soils (DWS, 2014). Within catchment T35DE 39 different land types occur in 50 different polygons. These land types are grouped into seven different broad land types, based on the dominant soil types found within the land type.

3. Material and methods

Covariate data collected included the Landsat 8 satellite imagery for 2016-12-11 (USGS, 2016), the SRTM 30 m DEM (USGS, 2016), climatic data from Schulze (2007) and the 1:250 000 vector geological (Council for Geosciences, 2007) and land type maps (Land Type Survey Staff, 1972–2006). The vector covariate layers were rasterized and all layers resampled to fit on the 30 m resolution grid of the Landsat images. From the Landsat images, the Normalized Difference Vegetation Index (NDVI; Rousse et al., 1973) and other band ratios were calculated, while various terrain derivatives were calculated from the DEM, including, but not limited to, slope gradient, profile curvature, planform curvature, topographic wetness index (TWI) and multi-resolution index of valley bottom flatness (MRVBF). SAGA-GIS was used for all GIS related calculations and derivatives.

Six hundred soil observation positions were determined with the conditioned Latin hypercube sampling method (cLHS, Minasny and McBratney, 2006). According to McBratney et al. (2003), it is advisable that all the SCORPAN factors (S = soil, C = climate, O = organisms, R = relief, P = parent material, A = age and N = spatial factor) commonly used in DSM are represented in mapping projects. To accommodate this, without including too many covariates into the cLHS, principal component analysis (PCA) was used to determine the principal component of each SCORPAN factor (except for time and spatial, for which there were none). Before performing the PCA, each covariate layer representing a SCORPAN factor was standardized to a scale of 0–100. The soil was represented by the Landsat 8 band 4, climate by mean precipitation and average temperature, organisms by the NDVI and Landsat 8 bands 5 and 6, relief by slope, profile curvature, planform curvature, TWI, altitude above channel network (AACN) and MRVBF. Lastly, the geological map represented parent material. To save time in the field, the observation positions were restricted to a buffer of 500 m around the available road network. To assess the representativeness of the road buffer area, the percentage of pixels of the entire site represented by the value range of the covariate layers used in the cLHS for the road network buffer was determined.

Table 2 The value range of the covariate layers used in the cLHS (conditioned Latin Hypercube Sampler) determination for the 500 m road network buffer area, as well as the percentage of pixels of the entire site represented by this range.

| Covariate | Min value | Max value | Percentage pixels covered by road buffer |
|---|---|---|---|
| Landsat 8 band 4 | 6010 | 9916 | 96% |
| Landsat 8 band 5 | 6934 | 13,732 | 96% |
| Landsat 8 band 6 | 6763 | 15,660 | 93% |
| Landsat 8 band 8 | 6249 | 9186 | 95% |
| NDVI | 0.031 | 0.268 | 97% |
| Valley depth | 0.22 | 204 | 95% |
| MRVBF | 0 | 3.00 | 100% |
| TWI | 1.46 | 14.6 | 98% |
| Vertical depth to channel network | 0 | 150 | 92% |
| Relative slope position | 0 | 0.99 | 97% |
| Slope percentage | 1.3 | 31.2 | 95% |
| Profile curvature | −4.95 | 4.50 | 96% |
| Planform curvature | −3.78 | 3.87 | 96% |
| Flow accumulation | 905 | 5,301,870 | 98% |
| Altitude | 921 | 1550 | 96% |
| Temperature | 15.2 | 17.3 | 99% |
| Mean precipitation | 631 | 961 | 98% |

Soil observations were made by soil auger to a depth of 1200 mm or refusal at the cLHS determined positions. The soils were classified according to the Soil Classification Working Group (1991) and the World Reference Base (IUSS, 2015). Furthermore, soil profiles were described per horizon, with feel test texture, structure, colour, mottles, presence of lime, stone percentage, and depth being noted. Observations could not be made at nine of the cLHS locations, as they fell within a residential area. An additional 56 spot observations were made from gully walls and road cuttings as the surveyor moved through the area. These observations were only classified, but not described.

The soil forms were then divided into soil erosion related soil associations (Table 1). It was noticed that many soil observations occurred within stream beds, due to erosion cutting down to bedrock. Therefore, all leptosol observations within 50 m from a stream were omitted from the database. Furthermore, a few local wetlands occurred on ridge crests. As these wetlands are seen to be anomalies within the landscape, all gleysol soil association observations occurring on ridge crests were also omitted from the database.

Table 1 The mapped soil associations and their distinctive characteristics.

| Soil association | Distinctive characteristic | Training obs | Evaluation obs |
|---|---|---|---|
| Cambisols | Young soil formed in unconsolidated sediment | 18 | 6 |
| Luvisols | Marked clay enrichment from the topsoil to the subsoil, with moderate blocky structure in subsoil | 50 | 16 |
| Planosols | Marked clay enrichment from topsoil to subsoil, with abrupt transition and prismatic structure in subsoil | 30 | 11 |
| Gleysols | Reduction in the shallow subsoil | 36 | 12 |
| Leptosols | Shallow soil on hard and weathered rock | 83 | 28 |
| Phaeozems | Dark topsoil with high base status | 13 | 4 |
| Acrisols | Uniform red and yellow-brown colours in subsoil | 168 | 56 |

Soil forms (SCWG, 1991) listed in the table: Oakleaf, Tukulu, Sepane, Valsrivier, Sterkspruit, Katspruit, Westleigh, Dundee, Mispah, Glenrosa, Bonheim, Milkwood, Clovelly, Griffin, Hutton, Shortlands, Pinedene, Avalon, Bainsvlei.

For mapping purposes, the soil observation dataset was divided into a training and an independent evaluation dataset, using a stratified random sampling design, with soil association as stratifier. The multinomial logistic regression (MNLR; Kempen et al., 2009) was used to derive a first soil association map, using Landsat bands 3, 4, 5, 6 and 7, the band ratios 3/1, 4/2, 5/7 and 5/4, the NDVI, median annual precipitation, average annual temperature, altitude above sea level, MRVBF, multi resolution of ridge top flatness (MRRTF), planform curvature, profile curvature, slope percentage, flow accumulation, the universal soil loss equation LS factor, relative slope position, TWI, AACN and the broad land types as covariates. Thereafter, the cubist algorithm (based on the M5 algorithm of Quinlan, 1992) was used to create a soil depth map using the same covariate layers. The created depth map was then added as a covariate layer and a new soil association map was created with the MNLR algorithm, using the same covariates as before, as well as the cubist created soil depth map.

The soil association maps were evaluated by determining whether the evaluation dataset points were mapped correctly. A one-pixel buffer was used as in van Zijl et al. (2012). These maps were evaluated based on the calculated total evaluation point accuracy, user's and producer's accuracy and the Kappa coefficient. Total evaluation point accuracy is the percentage of the evaluation dataset's points correctly mapped. The user's accuracy represents the percentage of evaluation observations correctly mapped within a specific map unit, and represents the map unit accuracy from the perspective of the user of the map. In contrast to this, the producer's accuracy represents the accuracy of the map unit from the producer's perspective, by expressing the number of correctly mapped observations of a specific association as a percentage of the total number of observations within the evaluation dataset of that association. The Kappa coefficient indicates how well the map reflects reality as opposed to the random designation of mapping units to different areas. Kappa coefficient values close to 0 indicate that the map accuracy is equal to a random designation of soil classes, while a value close to 1 indicates that the map represents reality well. The soil depth map was evaluated by determining the R² and root mean square error (RMSE) values of the soil depth. Furthermore, the value of the map was determined by comparing the DSM created soil map to the best available soil information of the area, which is the land type soil maps (Land Type Survey Staff, 1972–2006).
Firstly, the level of detail was evaluated by determining the number of polygons of each soil resource. Secondly, the relevance of the two datasets to gully erosion was compared by determining the gully density on each mapping unit. The gully layer as created by Le Roux (2018) was used.

4. Results and discussion

The 500 m buffer around the road network was a good representation of the entire site (Table 2). This table shows the range of each covariate layer used in the cLHS, for the area within the 500 m buffer around the road network, as well as the percentage of pixels within the entire site which are represented by the range of the road network pixels. All the covariate layers have at least a 92% representation by the road network buffer area, with 15 of the 17 layers having a representation percentage of 95% or higher.

The predicted soil depth map (Fig. 3a) achieved a mixed evaluation. The RMSE is 30 mm, which is very accurate in terms of the soil depth range (Hyndman and Athanasopoulos, 2018). However, the R² value is only 0.64, which is much lower than expected for such an accurate RMSE. If one examines the trend of predicted vs actual soil profile depths (Fig. 2), one can see that most of the observations had a depth of 1200 mm, the depth limit of the project, which skewed the RMSE value. Almost all of the soils shallower than 1200 mm were predicted deeper than their observed depths. Furthermore, the trend line crosses the x = y line, with a lower slope, which suggests that the depth map over predicts the depth of soils shallower than 850 mm, while it under predicts the depth of soils deeper than 850 mm. Therefore, despite the small RMSE value, the soil depth map does not represent reality well.

Fig. 2. The predicted vs observed soil depth graph. The solid line shows the trend line, while the dotted line shows x = y.

The two soil association maps (as well as the soil depth map) are given in Fig. 3. One can see that the soil association map which was made when including the soil depth map as a covariate includes more detail than the soil association map which was created without soil depth as covariate (Fig. 4). The increased detail is largely due to the leptosol association, which corresponds to the shallower areas in the soil depth map. When evaluating the two soil association maps, the map without depth achieved an evaluation point database accuracy of 64% and a Kappa statistic value of 0.38. When depth was included as a covariate, the evaluation point accuracy climbed to 68% and the Kappa statistic value to 0.42. Thus the incorporation of the soil depth map in the mapping process led to an increase in both detail and accuracy in the soil association map.

Fig. 3. Created soil maps in this project. a) soil depth map up to 1200 mm. b) soil association map created with environmental covariate layers only. c) soil association map created with environmental covariate and the soil depth layers.

Fig. 4. The improved level of detail at a 1:50 000 scale in the soil association map when including the soil depth map as a covariate (a) as opposed to the same area when the soil depth map is not included as a covariate (b).

The confusion matrix for the final soil association map (Table 3) indicates that all the soil associations were sufficiently mapped, with all the user's accuracies at 50% or above. However, the cambisol, phaeozems, luvisol and planosol soil associations achieved less accurate results for the producer's accuracy, with percentages less than 50%. That these soil associations generally have fewer observations than the soil associations with higher accuracies (acrisol and leptosol) indicates that the larger number of observations in certain soil associations allows the algorithm to be skewed towards predicting these soil associations, to the detriment of the less represented soil associations. The exception to this is the gleysol soil association, which had few observations, but still obtained a relatively high producer's accuracy. The reason for this is that the gleysol soil association, which basically represents wetlands, occupies a very specific landscape position and produces a very specific vegetation cover, which is easily modelled.

Overall, the soil association map can be accepted, based on the evaluation dataset point accuracy of 68%, as well as the Kappa statistic value of 0.42, which indicates a moderate agreement with reality. The evaluation dataset point accuracy compares well with similar maps produced, such as the 60% point accuracy and Kappa of 0.59 of Van Zijl et al. (2019), the 69% point accuracy and Kappa of 0.55 (Van Zijl et al., 2012) and the 69% point accuracy of MacMillan et al. (2010). However, the Kappa values are lower than those in the literature mentioned. This is due to the large number of acrisol observations, which make up a large proportion of the total observations.

When comparing the soil association map to the land type survey for this area, it is clear that the soil association map is a large improvement on the land type survey. The soil association map boasts 6168 polygons larger than 1 ha, which constitutes 123 times more detail than the 50 polygons of the land type survey. As mentioned before, the land type survey does contain additional information for each map unit, in the form of an estimate of the percentage area soil forms occupy within each terrain type within the polygon boundary. Two previous studies using different methods have tried to disaggregate one of the land types, Db334, within this area, using the soil terrain estimation. Botha (2016) used a manual landform based approach to disaggregate land type Db334 into five soil associations divided on the same basis as the soil associations used in this study. Her disaggregated maps achieved a point accuracy of 37% and a Kappa statistic of 0.2. Flynn et al. (2019) disaggregated the same land type with the automated DSMART algorithm (Odgers et al., 2014) with roughly the same success (point accuracy of 33%). Both of these studies concluded that the complex geology, which is not captured within the available geological spatial data sources, is the reason for the poor results within this landscape. Nevertheless, the results of these studies show that the extra information contained within the land types for this site does not improve the accurate level of detail contained within the land type inventory.

Superimposing the gully shapefile (Le Roux, 2018) for the area on the land type and soil association maps allows for the determination of the gully density for each map unit. Fig. 5 expresses the gully density for the different map units as a percentage of the total area of that map unit. The broad land types, which are groupings of similar individual land types, are given for the sake of brevity. It is clear that the Db land types have the highest gully density of the broad land types, as 13% of the area is occupied by gullies. This figure is largely due to the 13.6% gully density on the Db334 land type, which is the single land type with the highest gully density. The fact that the highest single gully density was found on a land type unit shows the accuracy of the land type polygons in this area, despite the inaccuracy of the inventories due to the complex geology.

The Db334 land type gully density is higher than the highest gully density on a map unit on the soil association map for the entire site, where the cambisols and planosols have gully densities of 11.6 and 11% respectively. These high figures are expected, as planosols are notorious for their erodibility in southern Africa (Van Zijl et al., 2014b) and the genesis of the cambisols in this area, which is mostly from transported material, leads to the cambisols forming in areas close to gullies due to the material brought in through the gullies. As the planosols occupy a larger area than the cambisols (6.8% vs 2.2%; Fig. 6) and thus have a larger portion of the total gullies occurring on them (28% vs 9.4%), it seems the unstable nature of the planosols is responsible for the large gully density of the area. The other soil associations all had gully densities less than 5%, with the acrisols having the lowest gully density.

Within the Db334 land type, the planosols have the largest gully density within this site, at 22.9%, followed by the cambisols at 16.1%. The other soil associations all have lower gully densities than the average within land type Db334. Nearly half of the gullies within Db334 occur on planosols (43.3%), showing how influential the occurrence of the planosols is in the formation of gullies. Thus, within the larger land type units, the soil association map does provide a finer level of detail with regards to the occurrence of gullying, and the factors controlling the gullies.

One can deduce from this information that the large gully density within the site, and specifically land type Db334, is due to the instability of the planosols. In areas such as land type Db334, where planosols dominate, the instability of the planosols has led to formation of gullies throughout the landscape, including on the more stable soils such as the acrisols and leptosols.

Based on this study, the following suggestions are made for soil conservation efforts:

1. On the planosol area conservation methods which minimize the build-up of free water within the soil should be used, such as the establishment of vegetation and decrease of grazing which will increase evapotranspiration. Conservation measures which increase water infiltration such as check dams, contours and sediment traps should be avoided.
2. The current ungullied planosol areas should be conserved as a priority, as they could become gully hotspots if gullying initiates.
3. Stabilizing existing gullies on the planosol area should be attempted by establishing vegetation, rather than building check dams, which would pond water.

5. Conclusions

Using the MNLR algorithm it was possible to create a soil map for the approximately 83 000 ha of the catchments T35DE. The final soil map achieved an evaluation dataset point accuracy of 68%, together with a Kappa value of 0.42, indicating a moderate agreement with reality. The created soil association map is an improvement on the available soil spatial information, with 123 times more polygons larger than one ha within the soil association map than the land type survey.

Adding a soil depth map created with the Cubist algorithm improved the soil association map in terms of level of detail and accuracy, even though the soil depth map was not necessarily a very good representation of reality. This shows the importance of including the soil factor as covariate layer when creating digital soil maps.
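The tension noted for the depth map, a small RMSE alongside a weak fit, can be illustrated with a minimal sketch of the two statistics. It assumes that RMSE is the root of the mean squared difference between predicted and observed depth and that R² is the coefficient of determination 1 − SS_res/SS_tot; the paper does not state which R² definition was used. The depths below are invented purely for illustration, with most profiles sitting at the 1200 mm auger limit.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical depths in mm: eight profiles reach the 1200 mm limit and are
# predicted exactly; two shallower profiles are predicted too deep.
observed = [1200] * 8 + [400, 800]
predicted = [1200] * 8 + [700, 1000]

depth_rmse = rmse(observed, predicted)
depth_r2 = r_squared(observed, predicted)
# RMSE restricted to the two shallow profiles only.
shallow_rmse = rmse(observed[8:], predicted[8:])
```

In this invented case the many exact predictions at the depth limit pull the overall RMSE well below the error on the shallow profiles alone, which is the kind of skew described for Fig. 2.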

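Table 3 Confusion matrix of the final soil association map. Rows give the evaluation observations of each soil association (producer's accuracy); columns give the soil association mapped at those observations (user's accuracy).

| Observed association | Non-empty cells, in column order | Total | Correct | Producer's accuracy (%) |
|---|---|---|---|---|
| Cambisol | 1, 1, 1, 2, 1 | 6 | 1 | 17 |
| Luvisol | 6, 1, 2, 6, 1 | 16 | 6 | 38 |
| Gleysol | 7, 1, 4 | 12 | 7 | 58 |
| Leptosol | 20, 7, 1 | 28 | 20 | 71 |
| Phaeozems | 1, 1, 1, 1 | 4 | 1 | 25 |
| Acrisol | 2, 3, 50, 1 | 56 | 50 | 89 |
| Planosol | 3, 3, 5 | 11 | 5 | 45 |

| Mapped association | Cambisol | Luvisol | Gleysol | Leptosol | Phaeozems | Acrisol | Planosol | All |
|---|---|---|---|---|---|---|---|---|
| Total | 1 | 8 | 11 | 29 | 2 | 73 | 9 | 133 |
| Correct | 1 | 6 | 7 | 20 | 1 | 50 | 5 | 90 |
| User's accuracy (%) | 100 | 75 | 64 | 69 | 50 | 68 | 56 | 68 |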
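The accuracy measures reported with a confusion matrix such as Table 3 can be sketched as follows. This is an illustrative sketch only: the three-class matrix and its labels are hypothetical, the row/column layout (rows observed, columns mapped) is an assumption stated in the code, the Kappa shown is the standard Cohen's Kappa, and the one-pixel buffer used in the evaluation is not modelled.

```python
def accuracy_measures(matrix, labels):
    """Summarise a square confusion matrix.

    Assumed layout: matrix[i][j] counts evaluation points observed as
    labels[i] and mapped as labels[j]. Every class is assumed to have at
    least one observed and one mapped point.
    """
    k = len(labels)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    correct = [matrix[i][i] for i in range(k)]

    # Total evaluation point accuracy: share of all points mapped correctly.
    overall = sum(correct) / n
    # Producer's accuracy: correct points / observed points of that class.
    producers = {labels[i]: correct[i] / row_totals[i] for i in range(k)}
    # User's accuracy: correct points / points mapped as that class.
    users = {labels[j]: correct[j] / col_totals[j] for j in range(k)}
    # Cohen's Kappa: agreement beyond that expected from a random designation.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa


# Hypothetical three-class example (not data from this study).
labels = ["A", "B", "C"]
matrix = [
    [8, 2, 0],
    [1, 6, 3],
    [1, 0, 9],
]
overall, producers, users, kappa = accuracy_measures(matrix, labels)
```

Keeping the user's and producer's accuracies separate matters for an unbalanced dataset, since a class with few observations can be mapped reliably where it is predicted yet still be missed at most of the points where it was observed.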


Fig. 5. Gully density percentage per mapping unit of different soil information sources. The codes Ad, Ae, Ac, Dc, Fa, Ab and Db are names for different broad land types and Db334 for the single land type with the highest gully density.
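The percentages plotted per mapping unit are simple area ratios, which can be sketched as below. The unit names and areas are hypothetical placeholders, not values from this study; in practice the areas would come from overlaying the gully layer on the map units in a GIS.

```python
def gully_density(unit_area_ha, gully_area_ha):
    """Gullied area of each map unit as a percentage of that unit's area."""
    return {
        unit: 100.0 * gully_area_ha[unit] / unit_area_ha[unit]
        for unit in unit_area_ha
    }


def share_of_total(values):
    """Each unit's value as a percentage of the sum over all units."""
    total = sum(values.values())
    return {unit: 100.0 * value / total for unit, value in values.items()}


# Hypothetical areas in hectares for three map units.
unit_area_ha = {"unit_1": 2000.0, "unit_2": 6000.0, "unit_3": 12000.0}
gully_area_ha = {"unit_1": 200.0, "unit_2": 300.0, "unit_3": 120.0}

density = gully_density(unit_area_ha, gully_area_ha)
area_share = share_of_total(unit_area_ha)
gully_share = share_of_total(gully_area_ha)
```

Reading density together with the two shares separates a unit that is intensely gullied but small from one that carries most of the gullied area simply because it is extensive.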

The cost benefit of limiting the cLHS determined observations to the cover 7% of the area. Within the land type with the highest gully 500 m around the available road network, was assessed to outweigh the density, Db334, the planosols are the most abundant soil association, slight decrease in representativeness of the potential observation points, covering 25.6% of the area. The instability of these soils increased the as the range of the covariate layer’s values used in the cLHS method, gully formation of the land type, with gullies spreading to the adjacent within the 500 m road buffer area, represented more than 92% of the soil forms. pixels of the entire site. The following soil conservation efforts are suggested: Superimposing the available gully layer on the soil association map allowed for deductions to be made on the stability of the soils. 1. On the planosol area conservation methods which minimizes the Cambisols and planosols are the most susceptible to having gullies form build up of free water within the soil should be used, such as the in them, and due to the planosols larger area, their distribution seems to establishment of vegetation and decrease of grazing which will in- play a controlling factor in gully distribution in this area, as more than a crease evapotranspiration. Conservation measures which increase quarter of the site’s gullies formed on planosols, although they only water infiltration such as check dams, contours and sediment traps

Fig. 6. Percentage of total area and total gully area per soil association for the entire site and land type Db334 respectively.


should be avoided.
2. The current ungullied planosol areas should be conserved as a priority, as they could become gully hotspots if gullying initiates.
3. Stabilizing existing gullies on the planosol area should be attempted by establishing vegetation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to acknowledge funding from the Department of Environmental Affairs of South Africa under the National Resources Management programme.
