Soil spectroscopy as a tool to assess organic carbon, iron oxides, and clay content in the Subtropical Thicket Biome of the Eastern Cape province of South Africa

Marco Nocita

02/07/2009

Supervisors:

Lammert Kooistra
Martin Bachmann

Master thesis project assigned by Living Lands (Secretariat of PRESENCE), South Africa.

Co-funded and supported by:
German Aerospace Agency – DLR, Germany
Gamtoos Irrigation Board – GIB, South Africa
Department of Water Affairs and Forestry – DWAF, South Africa

PRESENCE works in collaboration with and is supported by:
Restoration Research Group – R3G, South Africa
Council For Scientific and Industrial Research – CSIR, South Africa
ASSET Research, South Africa
Rhodes University, South Africa
Nelson Mandela Metropolitan University, South Africa
Wageningen University, The Netherlands
Dienst Landelijk Gebied – DLG, The Netherlands
Speerpunt Ecosysteemen Landschappervices – SELS (WUR)

Facilitated by the PRESENCE (Participatory Restoration of Ecosystem Services & Natural Capital, Eastern Cape) platform and EarthCollective

Abstract

In the subtropical thicket biome of the Eastern Cape province of South Africa, heavy browsing by goats, which remove shrub biomass more rapidly than it is replaced, transforms the dense closed-canopy shrubland into an open savanna-like system. This transformation causes many changes, including the depletion of soil fertility. This document presents a project that assesses organic carbon (OC), iron oxides, and clay content in the degraded thicket biome by combining soil spectroscopy with partial least squares regression (PLSR). The study area is a transect crossing the Eastern Cape province of South Africa from south-east to north-west, from latitude -33.57 to -32.59 and from longitude 25.38 (eastern extreme) to 25.26 (western extreme). It was selected through a GIS analysis that overlaid vegetation type, rainfall, and topography data sets. A total of 113 points were visited over a distance of 130 km. At every point, field spectroscopy measurements were taken and soil samples were collected from the first centimetre (topsoil) and from the 0-20 cm layer. The soil samples were analyzed chemically and spectrally. The study models the relationships between soil spectral reflectance, measured in situ and in the laboratory, and the soil parameters considered. The PLSR models developed with laboratory and field spectra gave good predictions of OC, with a root mean square error of validation (RMSEV) <0.6, and sufficient results for iron oxides (RMSEV always >0.55). The clay content models were not sufficiently accurate. The results indicated that soil stoniness is an important variable to consider when building prediction models for soil properties. Up-scaling the OC laboratory and field spectroscopy prediction models to the 232 EnMAP channels gave a high level of accuracy, even when a noise component was included (signal-to-noise ratio = 100).
The promising results of this study will serve as a basis for future up-scaling of the ground-based regression models to airborne and spaceborne hyperspectral data, in order to cover the whole subtropical thicket biome of South Africa.

Acknowledgments

I would like to thank PRESENCE and EarthCollective for their assistance and dedication during my time in South Africa. Special thanks to Silvia and Dieter for their patience and friendship and for the extraordinary organization at short notice. I would especially like to mention Saint Edwill Moore from Patensie, for always believing in the usefulness of the project and for solving problems just like the Wolf in Pulp Fiction. Thanks also to my drivers and my workers, for coming with me around the subtropical thicket biome to get dirty with soil, and to Mike Powell, for being a GIS magician. Thanks to Andreas Mueller and Martin Bachmann of the German Aero-Space Agency (DLR) for believing I’m not crazy, for giving me credit and the means (money), for answering my phone calls from South Africa with “Marco, are you feeling alone?”, and for sending me the best surfer of Wurzburg University, Christian Huettich, together with the spectrometer. Thanks to Lammert Kooistra for agreeing to supervise me. Thanks also to my girlfriend, friend, companion for life, and most beautiful swimmer in Munich, Lily, without whom this project would not exist and this life would not be so interesting. Last, special thanks go to my mother and my brother for their support in difficult moments, and to my father, always with me, always with the guitar.

Table of contents

Abstract
Acknowledgements
Table of contents
List of figures
List of tables

Chapter:

1. Introduction
1.1 Background of the Sub-tropical thicket biome
1.2 Problem description
1.3 Research objective and questions
1.4 Report overview

2. Literature review
2.1 Organic carbon, Iron oxides, and clay content
2.2 Soil spectroscopy
2.3 Spectral data manipulations and statistical analyses
2.4 VNIR-PLSR previous studies for soil properties estimation
2.5 Ground Spectroscopy Up-scaling process to Imaging Spectroscopy

3. Methodology
3.1 Study area
3.2 Data collection
3.2.1 Soil sampling campaign
3.2.2 Field spectroscopy campaign
3.2.3 Soil samples analyses
3.3 Model construction
3.3.1 Datasets available
3.3.2 Spectral preprocessing
3.3.3 PLSR model calibration and validation
3.4 EnMAP up-scaling
3.4.1 Spectral resampling
3.4.2 Noise simulation
3.4.3 Organic carbon model calibration and validation

4. Results
4.1 Laboratory analyses
4.2 Soil spectra interpretation
4.3 Models calibrations and validation
4.3.1 Organic carbon
4.3.2 Iron oxides
4.3.3 Clay content
4.3.4 EnMAP simulations

5. Discussion

6. Conclusions and recommendations

7. References

8. Appendixes

List of figures

Figure 1: Location of the thicket biome in southern Africa

Figure 2: Methodology overview

Figure 3: map of the STB vegetation types

Figure 4: map of the STB topography

Figure 5: map of the STB rainfall

Figure 6: map of the STB study area

Figure 7: soil sampling scheme

Figure 8: Mean laboratory and field spectra obtained from topsoil and 0-20cm soil samples

Figure 9: Topsoil laboratory and field spectra of plot 51, 52, and 53

Figure 10: ASD and resampled EnMAP topsoil laboratory and field spectra of plot 7

Figure 11: 0-20cm model, without water absorption bands, OC predicted vs. observed values

Figure 12: topsoil field without stones NIR-OC predicted vs. observed values

Figure 13: topsoil field without stones NIR-OC predicted vs. observed values

Figure 14: topsoil field with stones NIR OC predicted vs. observed values

Figure 15: variable importance for projection (VIP) representation related to topsoil laboratory (TL) organic carbon prediction model, considering the full spectral resolution (a), and excluding the water absorption bands (b)

Figure 16: variable importance for projection (VIP) representation of topsoil field stone (TFS) organic carbon prediction model, considering the full spectral resolution (a), and excluding the water absorption bands (b)

Figure 17: topsoil field stone NIR iron content predicted vs. observed values

Figure 18: B coefficients (VIP) of topsoil field stones iron oxides prediction model

Figure 19: spectral loadings and loadings weights of topsoil field stones iron oxides prediction model

Figure 20: EnMAP topsoil laboratory OC predicted vs. observed values

Figure 21: B coefficients of EnMAP topsoil laboratory OC predicted vs. observed values

Figure 22: EnMAP topsoil field, without stones, OC predicted vs. observed values

Figure 23: spectral loadings and weights of EnMAP topsoil field OC predicted vs. observed values

Figure 24: EnMAP topsoil field stones OC predicted vs. observed values

Figure 25: B coefficients of EnMAP topsoil field stone OC predicted vs. observed values

Figure 26: fence effect denoting severe land degradation in the STB

List of tables

Table 1: topsoil chemical analyses results ranges

Table 2: Pearson correlation coefficient between topsoil and 0-20 cm OC, Fe, and clay content

Table 3: calibration and validation of the OC prediction models using full spectral data

Table 4: calibration and validation of the OC prediction models using visible and near-infrared spectral data

Table 5: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible–near-infrared (VNIR) regions of the EM spectrum

Table 6: calibration and validation of Fe oxides prediction models using full spectral data

Table 7: calibration and validation of the Fe prediction models using visible and near-infrared spectral data

Table 8: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible–near-infrared (VNIR) regions of the EM spectrum

Table 9: calibration and validation of clay content prediction models using full spectral data

Table 10: calibration and validation of the Fe prediction models using visible and near-infrared spectral data

Table 11: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible–near-infrared (VNIR) regions of the EM spectrum

Table 12: calibration and validation of the OC prediction models using EnMAP resampled spectral data

Table 13: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions using EnMAP resampled and noise simulated spectra

1. Introduction

Soil, defined by Thompson (1957) as ‘‘the upper layer of the earth which may be dug, plowed, specifically, the loose surface material of the earth in which plants grow’’, is a complex material that is extremely variable in its physical and chemical composition. It is formed from exposed masses of partially weathered rocks and minerals of the earth’s crust. Soil formation, or genesis, is strongly dependent on the environmental conditions of both the atmosphere and the lithosphere. The soil body is a product of five factors: climate, time, organisms, topography, and parent materials. The great variability in soils is the result of interactions of these factors and their influence on the formation of different soil profiles (Buol et al., 1973). The properties of a soil reflect the interaction of many soil-forming processes that operate within the soil system. These processes result from an interaction of local conditions, such as climate, geology, topography, and vegetation, over time. Properties of soil materials affect the soil's character, fertility, ability to support plants, water-holding capacity, and biodiversity. Understanding soil properties makes it possible to use soils in a well-managed way and to minimize soil loss and damage (Ben-Dor et al., 2008). In the last two decades it has been shown that several physical, chemical, and biological properties can be predicted from the diffuse reflectance spectra of soil. Soil spectroscopy in the visible, near-infrared and mid-infrared ranges (VIS–NIR–MIR) of the electromagnetic spectrum, coupled with robust multivariate calibration such as partial least squares regression (PLSR), has created new opportunities for soil measurement in land resource survey (Janik and Skjemstad, 1995; McCarty et al., 2002).
The research presented in this study explores whether soil spectroscopy can be applied as a tool to estimate soil organic carbon (SOC), iron oxides, and clay content in the Subtropical Thicket Biome of the Eastern Cape province of South Africa.

1.1 Background of the Sub-tropical Thicket Biome

The Subtropical Thicket Biome (STB) of South Africa is centred in the south-western part of the Eastern Cape Province (33°S, 25°E), where it is the dominant formation in the central and eastern Little Karoo, and in the major river valleys coastwards of the Great Escarpment (Gamtoos, Sundays, Fish, Kei) (Vlok et al. 2003).

Thicket, as a mosaic with vegetation of other biomes (bush clumps and bontveld), is widespread throughout the subcontinent (figure 1). The climate of thicket’s core area is semi-arid to subhumid (250–800 mm yr-1) and subtropical to warm-temperate (largely frost-free) (Acocks 1953, Low and Rebelo 1996). The STB climate has a bimodal rainfall pattern, with peaks in spring and autumn, although copious rain may fall at any time of the year (Vlok et al., 2003). Unlike in savanna, fire is not a component of the STB, owing to the absence or low cover of grass, although thicket clumps in fire-prone matrices (thicket mosaics) are affected by fire (Cowling et al. 1997).

Figure 1: Location of the thicket biome (dark shading) in southern Africa (including Swaziland and Lesotho) (Low and Rebelo, 1996). Also shown are the boundaries of other biomes in the region and the distribution of vegetation types (after Low and Rebelo 1996) (light shading) that include thicket vegetation (mostly mosaics) at 1:100000 scale.

Subtropical thicket was recently recognized as a distinct southern African biome (Low and Rebelo 1996). Cowling (1983) suggested a Holocene origin, in which the formation was assembled from plants recruited from adjacent biomes, mainly forest, savanna and karoo, after climatic amelioration at the end of the Pleistocene. Recent phylogenetic data on species endemic to extant thicket suggest that the STB is part of a semi-arid tropical biome that was widespread across the globe in the early Tertiary, and was rich in basally-branching taxa that contributed lineages to vegetation formations that became established in the Neogene (Schrire et al. 2005).

STB is a dense formation of evergreen and weakly deciduous succulent shrubs (e.g., Portulacaria afra), spinescent shrubs (e.g., Azima tetracantha, Gymnosporia polycantha, Putterlickia pyracantha, Rhus longispina), and low-growing trees (2–5 m) (e.g., Pappea capensis, Euclea undulata, Schotia afra) (Cowling et al., 2004). The vegetation matrix has a rich flora, conservatively estimated at about 1600 species, 20% of which are endemic to the STB (Vlok et al., 2003). This vegetation type supports an exceptionally high natural diversity and abundance of large mammals (such as rhinoceros, elephants and antelope) that browse woody shrubs (Skead, 1987; Kerley et al. 1999). It is often intensively harvested by local people for wood, fruit and medicines (Cocks and Wiersum 2003), it can sustain appropriately managed goat pastoralism (Aucamp 1976; Stuart-Hill and Aucamp 1993), and it is the centre of growing tourism activities such as game farming, game watching, and safari tours (Kerley et al. 2002). In addition, the vegetation stores unusually large quantities of carbon for a semi-arid region (Mills et al. 2003; Mills and Fey, 2004).

Despite a long association with indigenous large herbivores (Midgley, 1991), thicket is surprisingly sensitive to injudicious pastoralism (Stuart-Hill 1992). Unsustainable heavy goat browsing can transform the dense closed-canopy shrubland into an open community comprising scattered and degraded thicket clumps and isolated trees in a matrix of ephemeral herbs (Kerley et al., 1995). Particularly vulnerable are drier forms of thicket dominated by the leaf succulent Portulacaria afra (spekboom) (Stuart-Hill, 1992). During the 20th century, of the 16,942 km² of solid (unbroken canopy) thicket with a substantial P. afra component, 46% has been heavily degraded and 36% moderately degraded by domestic herbivores (Lloyd et al., 2002). Excessive goat browsing of P. afra-dominated thicket has transformed this dense vegetation into a “pseudosavanna” where isolated trees of P. capensis and S. afra persist precariously in a field-layer matrix of ephemeral herbs. Mills and Fey (2004) reported that in degraded sites spekboom, which comprises the bulk of thicket cover, is entirely eliminated, the rate of leaf litter deposition is reduced by ~30% (4126 vs 2881 kg dry matter/ha/yr), biomass carbon by ~75% (52 vs 8 t C/ha), soil carbon by ~40% (0–10 cm, 71 vs 40 t C/ha), soil nitrogen by ~30% (0.33 vs 0.24%), and the rate of infiltration by ~60% (51 vs 19 mm/h, laboratory test).

The effect that the transformation of the thicket biome into an open savanna has on soil properties is not entirely clear. An experiment by Mills and Fey (2004) demonstrated that the biome changes described above caused an appreciable decrease in total carbon, total N, soil respiration (laboratory), laboratory infiltration rate, and medium plus coarse sand. In particular, the loss of approximately half the original soil organic matter during the transformation of thicket into savanna was attributed to:
• reduced return of organic matter to the soil: inputs via leaf litter (present as a dark layer, often several centimeters thick, under P. afra) and the sloughing of roots were probably reduced as bush cover declined;
• erosion of organic-rich surface soils, which probably increased as bush cover declined;
• an increased rate of mineralization of soil organic matter, caused by higher surface temperatures together with reduced interception of rain water and a greater number of wetting and drying cycles at the soil surface.

Carbon loss resulting from vegetation removal in succulent thicket is approximately 4.0 kg/m2 in soils to a depth of 500 mm and 4.5 kg/m2/yr in biomass (above- and belowground) (Mills 2003; Mills, O’Connor, et al. 2005).

In order to halt these degradation processes, the subtropical thicket rehabilitation project (STRP), initiated by the Working for Woodland program of the South African Government, entails planting cuttings of P. afra and other easily propagated succulent plant taxa typical of the STB (e.g., Crassula, Aloe, Euphorbia and Cotyledon spp.), at different densities (1-3 m intervals) and in different patterns (e.g., clumps and scattered), on all farms that agreed to shift from grazing activities to restoring the land (Powell et al., 2006). Besides re-establishing the vegetation matrix in the STB, the STRP aims to replenish soil fertility and, ultimately, through the Clean Development Mechanism (CDM), to enter the carbon market and contribute to the reduction of carbon dioxide (CO2) emissions. In order to monitor the effectiveness of the STRP, new analytical techniques for rapid sampling and instant determination of soil properties, at the field and regional scales, are required.
The research presented in this document is part of the STRP, as it investigates whether visible-near infrared spectroscopy (VNIRS) could be a suitable technique to rapidly quantify various soil characteristics simultaneously.
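
The relative reductions quoted above from Mills and Fey (2004) follow from the paired values for intact and degraded sites. As a minimal sketch, the percentage decrease can be recomputed as (intact - degraded) / intact. The helper name `relative_reduction` and the dictionary layout are illustrative assumptions and are not part of the cited study. The recomputed values need not coincide exactly with the rounded percentages given in the text.

```python
# Pairs (intact, degraded) as quoted in the text from Mills and Fey (2004).
# The dictionary keys and the helper name are illustrative, not from the source.
PAIRS = {
    "leaf litter deposition (kg dry matter/ha/yr)": (4126, 2881),
    "biomass carbon (t C/ha)": (52, 8),
    "soil carbon, 0-10 cm (t C/ha)": (71, 40),
    "soil nitrogen (%)": (0.33, 0.24),
    "infiltration rate (mm/h)": (51, 19),
}


def relative_reduction(intact, degraded):
    """Return the percentage decrease from the intact to the degraded value."""
    return 100.0 * (intact - degraded) / intact


REDUCTIONS = {name: relative_reduction(a, b) for name, (a, b) in PAIRS.items()}

if __name__ == "__main__":
    for name, pct in REDUCTIONS.items():
        print(f"{name}: {pct:.1f}% lower in degraded sites")
```

Printing the values next to the quoted figures is a quick way to check how each rounded percentage relates to its underlying pair.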

1.2 Problem description

In Sub-Saharan Africa (SSA), detailed information on the soil resource base is generally inadequate for most developmental purposes. In most countries, farm-level information and detailed soil maps are non-existent. Coupled with other socioeconomic constraints, this provides one explanation for the lack of progress in poverty alleviation and food security (Cleaver and Schreiber, 1994). Soil survey has traditionally been the means of estimating soil properties. Shepherd and Walsh (2002) pointed out the pressing need for new techniques to measure soil properties that are faster and cheaper than conventional soil laboratory methods, and that could provide rapid and reliable quantification of soil constituents across the landscape, leading to soil mapping at a higher spatial resolution (Packepsky et al., 2001; Ellert et al., 2002; Morgan et al., 2003; Sadler, 2004).

Visible and near-infrared spectroscopy (VNIRS), both in the field and in the laboratory, is attracting much interest in the soil science community. It has a number of advantages over conventional methods of soil analysis: it is more rapid and cheaper, and hence more efficient when a large number of samples and analyses are required. Moreover, it may be used to assess various physical, chemical and biological soil properties simultaneously (McCarty et al, 2002). Under laboratory conditions, where physical parameters remain constant, no atmospheric attenuation exists, and spectral noise is minimal, variation in a soil spectrum depends on mineralogy. Under such conditions the empirical relationship between the chemistry and the reflectance properties of powders can provide quantitative information about unknown materials solely from their reflectance spectra (Condit, 1972). Spectra collected under field conditions require several factors to be taken into account, such as the influence of green vegetation and consequently the timing of the reflectance measurements, soil moisture (which reduces the amount of light reflected by soil), illumination type, surface area of the field of view (FOV), and soil structure (Kooistra et al, 2003).
The use of a contact probe (active sensor) to collect field spectra can minimize the difference between field and laboratory spectra, because it eliminates the vegetation effect and limits changes in illumination conditions, making it possible to work continuously even under unfavorable weather conditions. Soil spectroscopy methods allow the creation of high-resolution maps of horizontal soil variability, but are limited in their ability to provide high-resolution vertical soil information; soil profile (vertical) information is limited by the capacity to collect and analyze soil cores (Waiser, 2007). The construction of soil property prediction models follows established procedures used in previous research: creation of training and test sets, spectral data manipulation, PLSR calibration, and subsequent validation (Cozzolino and Moron, 2004). At present PLSR is the statistical technique most often combined with soil spectroscopy, but several other techniques have been used in the past, depending on the properties analyzed and the knowledge available (Viscarra-Rossel et al., 2006). In order to develop the strongest possible prediction models for the analyzed soil properties, the spectral ranges 350-399 nm, 796-814 nm, and 2401-2500 nm, and the water absorption bands (1400 and 1900 nm), can be excluded, because they are insensitive or affected by artefacts produced by the spectrometer (Viscarra-Rossel, 2006).

One of the aims of this study is the development of an OC prediction model based on the EnMAP spectral resolution. EnMAP is a new hyperspectral sensor that is scheduled to be launched in 2012. Since 1972, when the first commercial satellite (ERTS-1, known as LANDSAT-1) was placed in orbit, remote sensing of soils has become an attractive tool for assessing and mapping the soil environment from a distance. Since then, tremendous progress has been made in both data acquisition technology and data processing techniques.
Goetz and Wellman (1984) identified hyperspectral remote sensing (HRS) as an advanced tool that provides high spectral resolution data, with the aim of providing near-laboratory-quality reflectance or emittance for each single picture element (pixel) from a distance. This information enables the identification of objects based on the spectral absorption features of chromophores and has been found very useful in many terrestrial and marine applications (Ben-Dor, 2003). During the last few years, it has been shown that soil spectra across the visible, near-infrared and short-wave infrared (VIS-NIR-SWIR; 400–2500 nm) spectral regions are characterized by significant spectral signals that enable quantitative analysis of several soil properties. The complete up-scaling process from ground spectroscopy to HRS needs to consider the effects on soil reflectance of atmospheric attenuation, a varying field of view for every pixel, spectral instability, vegetation, soil crust, surface roughness and pixel size (Ben-Dor et al., 2008). Because most applications for soils have been developed for point spectrometry, their adaptation to the imaging spectroscopy (IS) domain requires proper attention and adequate solutions to minimize the above problems.
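
The band-exclusion step mentioned in this section, and the root mean square error later used to validate the models, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually implemented in the study: the text gives only the centres of the water absorption bands (1400 and 1900 nm), so the ±50 nm window, the 1 nm sampling, and all function names are hypothetical. The PLSR fit itself is not shown.

```python
# Excluded ranges (nm) as listed in the text; the +/-50 nm half-width around the
# 1400 and 1900 nm water absorption bands is an assumption made for this sketch.
NOISY_RANGES = [(350, 399), (796, 814), (2401, 2500)]
WATER_BAND_CENTRES = [1400, 1900]
WATER_HALF_WIDTH = 50  # hypothetical value, not from the source


def is_kept(wavelength_nm):
    """Return True if a wavelength lies outside every excluded range."""
    for low, high in NOISY_RANGES:
        if low <= wavelength_nm <= high:
            return False
    for centre in WATER_BAND_CENTRES:
        if abs(wavelength_nm - centre) <= WATER_HALF_WIDTH:
            return False
    return True


def filter_spectrum(wavelengths, reflectance):
    """Drop excluded bands from one spectrum; returns (wavelengths, reflectance)."""
    kept = [(w, r) for w, r in zip(wavelengths, reflectance) if is_kept(w)]
    return [w for w, _ in kept], [r for _, r in kept]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


if __name__ == "__main__":
    wavelengths = list(range(350, 2501))    # assumed 1 nm sampling, 350-2500 nm
    reflectance = [0.3] * len(wavelengths)  # flat placeholder spectrum
    kept_w, kept_r = filter_spectrum(wavelengths, reflectance)
    print(len(wavelengths), "bands before,", len(kept_w), "after exclusion")
    print("RMSE example:", rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```

Keeping the excluded ranges in a single list makes it straightforward to compare models built with and without the water absorption bands, as is done for the OC models later in the thesis.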

1.3 Research objective and questions

The objective of this study is to calibrate and validate multiple regression models for the prediction of soil organic carbon, iron oxides and clay content, through the application of soil laboratory and field spectroscopy techniques, along a transect across the STB of the Eastern Cape Province of South Africa. The prediction models are constructed considering soil stoniness, different spectral ranges, and the optimal number of factors to include in the PLSR model calibration. The development of a spectroscopy-based approach for monitoring soil properties in degraded land such as the STB will serve future up-scaling to airborne and spaceborne hyperspectral data.

The following research questions were addressed in this study:
• Are there significant differences between the SOC, Fe oxides, and clay content prediction models derived from topsoil field spectral reflectances and the models developed from topsoil laboratory spectral reflectances?
  o Does the stoniness of the topsoil influence the prediction accuracy of the models derived from topsoil field and laboratory spectral reflectances?
• Are there significant differences between the SOC, Fe oxides, and clay content prediction models derived from topsoil field and laboratory spectral reflectances and the models developed from laboratory spectral reflectances of the 0-20 cm depth samples?
• Which spectral range should be chosen to improve the PLSR models for the prediction of the considered parameters?
• Does the up-scaling process (spectral resampling) to EnMAP spectral resolution produce significant PLSR models for the prediction of SOC?

1.4 Report overview

The following chapters describe the steps taken in this research project. Chapter 2 presents the theoretical background and past research on soil spectroscopy as a tool for developing soil property prediction models; its first part is dedicated to soil spectroscopy, and the second focuses on the statistics used to produce the prediction models. Chapter 3 describes the methodology, from the fieldwork (soil sampling and field spectral measurements) through the laboratory analyses to the data analysis, with its calibration and validation phases. Chapter 4 presents the results. Starting with the chemical analyses, it proceeds with the spectral interpretation and the results of all the models built to predict OC, iron oxides and clay content. It ends with the results of the up-scaling simulation to EnMAP spectral resolution, including sensor noise.

Chapter 5 discusses the results presented in chapter 4, explaining the reasons behind them and the possible future developments that this study might generate. Chapter 6 draws the conclusions, together with the recommendations that one full year of research suggests for improving the results and avoiding errors. Chapter 7 lists the references used in writing the report.

2. Literature review

2.1 Organic carbon, iron oxides, and clay content

Soils contain carbon in both organic and inorganic forms. In most soils the majority of carbon is held as soil organic carbon (SOC), with the exception of calcareous soils, in which carbon is stored in its inorganic form. Soils vary in the amount of SOC they contain, ranging from less than 1% in many sandy soils to greater than 20% in soils found in wetlands or bogs (McVay and Rice, 2002). SOC has a great influence on the physical (soil structure and porosity), chemical (complexing agent, sorbent), and biological (source of nutrients to plants and microorganisms) properties of soils, and affects the physical characteristics of soils with regard to accelerated soil erosion processes (e.g., hydraulic conductivity and soil structure). SOC also plays an important role in the global carbon cycle (Schlesinger, 1990). Plants convert carbon dioxide (CO2) to organic carbon as they produce stems, leaves, and roots. The cycle of life and death of plants results in the accumulation of decomposing plant tissue, both aboveground and belowground (plant roots), and produces a significant amount of SOC (Milne et al., 2008). STB soils form under a shrub forest; they tend to accumulate high levels of SOC near the surface and have lower carbon levels in the subsoil, owing to the leaf litter and decaying wood from limbs and trees that accumulate at the soil surface (Mills et al, 2004). Soil layering is also a function of higher annual rainfall and of the accelerated weathering that enriches the subsoil with clay.

Clay is a naturally occurring material composed primarily of fine-grained minerals, which shows plasticity through a variable range of water content, and which can be hardened when dried and/or fired. Clay deposits are mostly composed of clay minerals (phyllosilicate minerals) and variable amounts of water trapped in the mineral structure by polar attraction. Organic materials which do not impart plasticity may also be a part of clay deposits (Guggenheim and Martin, 1995). Clay content is strongly linked with the soil water holding capacity and directly proportional to the soil organic matter quality and content. Under identical annual organic matter input, a slower organic matter turnover, a larger microbial biomass and more organic matter are expected in soils with a high clay content than in soils with a low clay content within the same climatic area (Reuter, 1991). In classical concepts, stable clay-organic complexes are assumed to be responsible for an increased formation of stabilised organic matter in clay-rich soils.

Iron (Fe) is one of the most common minerals contained in soils (Hunt, 1980). It is an indicator of soil fertility because iron oxides act on soil aggregation, affecting infiltration capacity, hydraulic conductivity, water-retention capacity, tilth, gas exchange, organic matter decomposition, and erodibility (Duiker et al., 2003); moreover, Fe oxides help to assess the suitability of an area for cultivating specific crops and are an indicator of the age of the deposits (Torrent et al., 1980). Iron oxides, which can be an indicator of soil formation processes and an important parameter for soil classification, may be correlated with soil stability and structure (Ben-Dor, 2008), and offer important indications about drainage conditions in the soil. Soil Fe oxides vary greatly with respect to mineral species, concentration and crystal properties. The physical and chemical parameters which influence Fe oxide formation depend on the environmental conditions in the pedosphere, which vary in space and time, e.g. through changing water/air content (Schwertmann and Cornell, 2003). The Fe oxide content of a soil may vary between <1 and several hundred g kg-1, depending on the type and Fe content of the parent rock and on the maturity of the soil. As soil develops, more and more of the original Fe-bearing minerals decompose and most of their Fe is precipitated as pedogenic Fe oxides (McFadden and Hendricks, 1985).

2.2 Soil spectroscopy

Spectroscopy is the study of light as a function of wavelength that has been emitted, reflected, or scattered from a solid, liquid, or gas (Clark, 1999). When light strikes a material, it is absorbed, reflected, or transmitted, and spectral measurements can quantify the amount of light reflected or transmitted (Workman and Shenk, 2004). Reflectance spectroscopy measures the scattering of light reflected at all angles from a surface. When the diffuse reflectance of a material is measured, the absorbance bands provide information about the material's molecular composition. Absorbance peaks appear as valleys in the spectral signature when it is presented as reflectance. Three parameters of a spectrum are important: 1) the wavelength at which peaks occur, 2) the amplitude of the peak compared with a 100% reflected or transmitted standard, and 3) the bandwidth, which refers to the broadness of the peak (Workman and Shenk, 2004). Visible to mid-infrared (MIR) spectroscopy has been used to quantify soil properties with varying accuracy. The visible (VIS), near-infrared (NIR), and MIR spectral ranges are 350 to 700 nm, 700 to 2500 nm, and 2500 to 25000 nm, respectively. McCarty et al. (2002) have shown that when measuring organic and inorganic C, the MIR region produced higher R2 values and lower root mean squared deviations than NIR. Mid-infrared spectroscopy works better because the fundamental absorptions of interest in soils exist in the MIR range (McCarty et al., 2002). However, although MIR has been proven to give better prediction accuracies, it is less feasible for field and laboratory studies because of cost, portability, and required sample preparation. Hence, the majority of spectroscopy research in soils has occurred in the visible-near infrared (VNIR) regions. The VIS region gives information on Fe oxides like hematite (Gaffey et al., 1993), while the NIR region is dominated by vibration overtones of SO₄²⁻, CO₃²⁻, and OH⁻ and combination bands of H2O and CO2 (Clark, 1999; McCarty et al., 2002). Though the overtones and combination bands in the NIR spectrum are more indirect signatures of key soil constituents, VNIR spectrometers are smaller, cheaper, and field portable as compared to MIR units (Janik et al., 1998). In recent years, it has been shown that VNIR spectroscopy (VNIRS) offers a rapid and non-destructive technique to quantify soil properties, compared to traditional chemical analyses; VNIRS requires less sample preparation and few or no chemical reagents, is highly adaptable to automated and in situ measurements, and has the potential to analyze various soil properties simultaneously (McCarty et al., 2002; Viscarra-Rossel et al., 2006). For example, under laboratory conditions, coupling VNIRS with multivariate calibration can accurately determine the organic matter (Reeves et al., 2005; Salgo et al., 1998) and clay content (Ben-Dor et al., 1995; Kooistra et al., 2001) of the soil.
Most VNIR studies are conducted under controlled laboratory conditions, but investigations done in situ (Daniel et al., 2003; Kooistra et al., 2003) also have produced promising results.

2.3 Spectral data manipulation and statistical analyses

Before the reflectance measurements can be used for model calibration, the spectral data require some manipulation, such as transformation, pre-processing and pre-treatment. Manipulating spectra with derivatives and transformations to log space enhances weak spectral features and minimizes physical effects (Demetriades-Shah et al., 1990). During the last 10 years, this methodology has been widely developed. It was adopted from a strategy developed about 40 years ago in the food science discipline (Ben-Gera and Norris, 1968). To reduce non-linearities in the spectra, reflectance R is commonly transformed to log (1/R) (Viscarra-Rossel, 2008). Spectral normalization can be performed using multiplicative scatter correction (MSC), in order to correct for light scattering variations in reflectance spectroscopy (Geladi et al., 1985), or through the standard normal variate (SNV) transformation (Barnes et al., 1989), used to remove interferences due to light scattering and path length variations. Random noise reduction and signal-to-noise ratio (SNR) improvement are generally achieved using the Savitzky-Golay filter (Savitzky and Golay, 1964), which uses a moving polynomial fit of any order; the filter consists of (2n + 1) points, where n is the half-width of the smoothing window. The value at the centre of each window is replaced by the value of the fitted polynomial. Spectral resolution enhancement and the elimination of background effects are widely obtained with differentiation (Reeves III et al., 1999). The first derivative removes additive constant background effects, while the second derivative also removes linear baseline slope variations. Spectral data manipulation was developed especially to improve the performance of hyperspectral remote sensing data, collected under natural illumination, while laboratory spectroscopy data, collected under controlled conditions, were not treated (Tsai and Philpot, 1998). However, data pre-processing techniques, together with band selection, are now widely applied in soil spectroscopy (Kooistra et al., 2003; Viscarra-Rossel, 2006).
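The transformations described above can be illustrated with a minimal sketch in plain Python. It is not the processing chain used in this study: the reflectance values are made up, the function names are hypothetical, and the derivative is a simple central difference rather than a Savitzky-Golay filter. It only shows the log (1/R) transformation, the SNV scaling of a single spectrum, and the fact that a first derivative is insensitive to a constant additive offset.

```python
import math
from statistics import mean, stdev


def to_absorbance(reflectance):
    """Transform reflectance R (0 < R <= 1) into apparent absorbance log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative; the two end points are dropped."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step)
            for i in range(1, len(spectrum) - 1)]


# Made-up reflectance values at five consecutive 1 nm bands (illustration only).
reflectance = [0.20, 0.22, 0.25, 0.24, 0.30]
absorbance = to_absorbance(reflectance)
scaled = snv(absorbance)

# A constant additive offset leaves the first derivative unchanged.
shifted = [r + 0.10 for r in reflectance]
derivative = first_derivative(reflectance)
derivative_shifted = first_derivative(shifted)
```

After SNV each spectrum has zero mean and unit standard deviation, which is why the transformation suppresses multiplicative scatter differences between samples.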
Once spectral data have been processed, the useful information contained in optical spectra can be extracted using chemometric techniques such as multiple linear regression (MLR), partial least squares regression (PLSR), and principal component analysis (PCA). MLR was developed by Ben-Gera and Norris (1968a, 1968b) to determine moisture content in soybeans using NIR spectra. MLR handles nonlinearity by adding more terms to the equation. The added terms are chosen and scaled so that the nonlinearities cancel while a net linear sum is maintained:

$C = e_0 + b_1 A_1 + b_2 A_2 + \dots + b_n A_n$ (Eq. 1)

where C is concentration, An is the absorbance at wavelength n, e0 is a correction factor for interfering absorbances, and bn is a coefficient. This implicit compensation allowed Norris to use diffuse reflectance on ground grain and oil, providing a simplicity of sample preparation and presentation that is a singularly important reason for the wide acceptance of NIR analysis. MLR methods depend on the selection of a limited set of analytical wavelengths and the determination of suitable calibration coefficients to be applied to the data at these points. Alternative approaches, such as PCA and PLSR, use all or large portions of the spectrum. Principal components regression (PCR) is a two-stage process. First, it minimizes the number of independent components required to describe the variations across the entire spectrum and between spectra. This technique enables several thousand spectral points to be reduced to a few principal components (PCs), where the PCs describe the spectral variance across all the samples. Second, these PCs are regressed against known property data (measured concentration), and calibration models are then constructed. These models are validated using separate, independent and well-characterized samples to ensure they are robust enough to be used to predict property data from spectral information (Brereton, 1990). PLSR is the most common technique in the soil science literature (Janik et al., 1998; Reeves III, 1999; Reeves III and McCarty, 2001; Dunn et al., 2002; Lee et al., 2003). PLSR is an orthogonal data compression method that allows researchers to look at two-dimensional spectra in multidimensional space. The advantage of looking at the data in multidimensional space is that some of the redundancy is removed, patterns can be described from the center of the spectra, and the distances between peaks can be quantified (Workman and Shenk, 2004).
PLSR is used to construct a predictive model with many factors (also called predictors or X-variables) that are highly collinear (Tobias, 1995; Wold et al., 2001). One advantage PLSR has over MLR and PCA is that PLSR is more robust, meaning that the calibration model changes little with new calibration samples (Geladi and Kowalski, 1986). PLSR rests on several assumptions. The first assumption is that the model is built from a small number of latent variables (Wold et al., 2001). These latent variables are the most important elements of the model and carry more weight in determining the predicted property. The concept of latent variables allows the assumption that the X and Y variables are not independent, so that a few latent variables, rather than the full spectra, can predict some property of a given material (Wold et al., 2001). The second assumption is that a multidimensional function, F(u,v), is created from the X and Y data (Wold et al., 2001). The u-vector describes changes in the observations and the v-vector describes changes within the spectral variables (Wold et al., 2001). A third assumption concerns the homogeneity of samples, meaning that the parameters that govern the influence of X on Y stay the same (Wold et al., 2001).

PLSR starts by converting the X-variables (X, spectral data) into X-scores (T) and X-loadings (P'), which makes PLSR similar to PCA (Geladi and Kowalski, 1986). The Y-variables (Y, soil laboratory data) are treated the same way, creating Y-scores (U) and Y-loadings (Q'). The following formulas express these two outer relationships:

$X = TP' + E$ (eq. 2)

$Y = UQ' + F^*$ (eq. 3)

The $E$ and $F^*$ values are the errors, or residuals. An inner relationship which links the X and Y blocks together is

$U = bT$ (eq. 4)

where b is the regression coefficient. These three equations are used in PCA, but in PLSR the P' value is replaced by weights (W'). Weights have to be used in PLSR because the order of operations is changed, and otherwise the orthogonal t-values would not be calculated (Geladi and Kowalski, 1986). Equations 2 and 3 are combined to give the mixed relationship in PLS, where U' is a row vector and B contains the regression coefficients ($b_1$, $b_2$) (Geladi and Kowalski, 1986):

$W' = U'X / U'U$ (eq. 5)

$Y = TBQ' + F$ (eq. 6)
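To make the roles of weights, scores, loadings and the inner coefficient b more concrete, the following is a minimal sketch of PLSR for a single response variable (often called PLS1), written in plain Python with NIPALS-style deflation. It is an illustration under stated assumptions, not the implementation used in this study: the function names `pls1_fit` and `pls1_predict` are hypothetical, the data are made up, and no cross-validation is included.

```python
import math


def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def column(matrix, j):
    """Column j of a list-of-rows matrix."""
    return [row[j] for row in matrix]


def pls1_fit(X, y, n_components):
    """Fit PLS1: X is a list of spectra (rows), y a list of measured values."""
    n, m = len(X), len(X[0])
    x_mean = [sum(column(X, j)) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        w = [dot(column(E, j), f) for j in range(m)]           # weights w ~ E'f
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # X-scores
        tt = dot(t, t)
        p = [dot(column(E, j), t) / tt for j in range(m)]      # X-loadings
        b = dot(f, t) / tt                                     # inner coefficient
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - b * t[i] for i in range(n)]
        components.append((w, p, b))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, b in components:
        t = dot(e, w)
        y_hat += b * t
        e = [v - t * pj for v, pj in zip(e, p)]
    return y_hat


# Made-up two-band "spectra" whose response is exactly 3*x0 + 2*x1.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, 2.0, 5.0, 8.0]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [3.0, 2.0])
```

With as many components as there are independent X-variables, PLS1 reproduces the ordinary least squares fit, so the toy example recovers the exact linear relation; in practice far fewer components than bands are retained.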

2.4 Previous VNIR-PLSR studies to quantify OC, Fe oxides, clay content and other soil properties

Since the 1980s, many experiments have tested the possibility of applying laboratory and field spectroscopy to the simultaneous assessment of various soil properties. For example, Dalal and Henry (1986) estimated OC, total N and moisture content, coupling NIR laboratory spectroscopy and MSR, obtaining an R2 of 0.93. McCarty et al. (2002) coupled MIR and NIRS with PLSR to quantify total, organic and inorganic carbon in samples representing 14 soil series collected over a large region in the west central United States. The fairly good results obtained (R2: 0.82) were confirmed by other studies, such as the analysis of soil C and N using NIRS by Chang and Laird (2002), who correlated NIR spectra of the samples with measured values of organic C, inorganic C, total C, total N, and C:N ratios using partial least squares regression. Cozzolino and Moron (2003) used laboratory VNIRS with PLSR to estimate silt, sand, clay (R2: 0.86), calcium (Ca), potassium (K), sodium (Na), magnesium (Mg), copper (Cu) and Fe (R2: 0.90) of 332 samples of different soils from Uruguay (South America). Laboratory VNIRS was also applied to the assessment of organic matter (OM) and clay content in a floodplain along the river Rhine in the Netherlands (Kooistra et al., 2003). The relation between observed and predicted values, established with PLSR, gave fairly good results for OM (R2: 0.69) and clay content (R2: 0.92) under laboratory conditions, while field spectroscopy gave less accurate results. Viscarra-Rossel et al. (2006) showed that using VIS, NIR, or MIR, or combining them, with PLSR could favour the simultaneous assessment of different soil properties in Australian soils, while Islam et al. (2003) used ultraviolet, VIS, and NIRS, in combination with PCA, to assess several soil properties, among which OC, clay and Fe.

2.5 Ground Spectroscopy Up-scaling process to Imaging Spectroscopy

Numerous studies have shown the ability of field and laboratory spectroscopy to predict soil properties, but there is a lack of information on the application of remote spectroscopy to soil mapping over large areas. Ben-Dor et al. (2002), based on airborne HRS data, used the VNIR approach to map several soil properties. More recently, Selige et al. (2006), based on airborne HyMap scanner data, developed a methodology to quantify topsoil organic matter and texture in a rapid and non-destructive manner. Stevens et al. (2008) used airborne VNIR-MIR spectroscopy to monitor SOC in croplands at regional scale. Gomez et al. (2008) compared SOC prediction models obtained with Hyperion hyperspectral satellite data (400-2500 nm) and VNIR field spectra, obtaining more accurate models with field data than with satellite data, due to the noise of the Hyperion spectra and the spatial resolution (30 m) of the Hyperion sensor. Imaging spectroscopy (IS) has drawbacks relative to point spectrometry, such as a low signal-to-noise ratio, atmospheric attenuation, a varying field of view for every pixel, spectral instability, a low integration time for a given pixel, a spectral mixing problem, optical shifts from one pixel to another, and bidirectional reflectance distribution function (BRDF) effects (Ben-Dor et al., 2008).

3. Methodology

The methodology used in this research can be divided into two parts: fieldwork and soil analyses, and data analyses and model development (fig. 2). Before going into the field, the study area was selected based on a GIS analysis (see 3.1). The fieldwork, which formed the internship of my MSc (as summarized in the related report), was divided into a soil sampling campaign and a field spectroscopy campaign (see 3.2), during which all the soil samples and the spectral data were collected.

Figure 2: Methodology overview. [Flowchart. Fieldwork and soil analyses: GIS analysis of rainfall, topography and vegetation data; transect selection; fieldwork (soil sampling campaign and field spectroscopy campaign); collection of 0-20 cm soil samples, topsoil samples and topsoil field spectroscopy data; chemical and laboratory spectral analyses. Data analyses and models development: data preprocessing; dataset division into training set and test set; PLSR, multiple and linear regression; calibration model; validation. Outputs: 0-20 cm laboratory, topsoil laboratory and topsoil field prediction models for Org C, Fe oxides and clay; topsoil and 0-20 cm Org C models up-scaled to EnMAP band channels.]

Soil chemical analyses were carried out in South Africa, while laboratory spectral analyses were carried out in Munich, Germany. The resulting datasets were pre-processed and used to calibrate and validate the OC, iron oxides and clay content prediction models (see 3.3). The OC prediction models, from both laboratory and field spectral data, were used to simulate the up-scaling scenario to EnMAP spectral channels (see 3.4).

3.1 Study area

The study area, located in the STB, is a transect crossing the Eastern Cape province of South Africa in a south east-north west direction, from latitude -33.57 to -32.59 and from longitude 25.38 (eastern extreme) to 25.26 (western extreme). The selection of the study area was based on a GIS analysis which took into consideration three available datasets: vegetation type, rainfall and topography. The vegetation map indicates 21 vegetation types (figure 3), while topography shows an altitude range from 100 to 1100 m a.s.l. (11 classes) (figure 4).

Figure 3: map of the STB vegetation types

Figure 4: map of the STB topography

The rainfall dataset covered the full Eastern Cape province, giving a rainfall range from 200 to 1000 mm/yr (8 classes) (figure 5). The three datasets were overlaid, obtaining a map with 31 stratified biome classes. The transect covered 21 of the 31 biome classes. In total, 113 points were visited over a distance of about 130 km (figure 6).

Figure 5: map of the STB rainfall

Figure 6: map of the sampled points in the STB

A transect along the STB was chosen as the study area because i) it covers the largest possible number of classes produced by the GIS analysis, and ii) a future flight campaign for the collection of hyperspectral data will follow the same line as the studied transect. The next step will be to use the soil spectral library created for this project to build soil property prediction models, which will be up-scaled to airborne hyperspectral data.

3.2 Data collection

The data used to construct the prediction models were collected during the soil sampling and field spectroscopy campaigns.

3.2.1 Soil sampling campaign

During the soil sampling campaign, 113 points were visited (appendix 1) and soil samples were collected according to the following scheme: a 20x20x20 cm hole was dug, and the soil was collected in one bag, well homogenized and divided into two bags, one destined for soil chemical analyses and one for laboratory spectral analyses. As stated before, the soil was collected to a depth of 20 cm, but in some cases, due to shallow stones, it was not possible to reach the 20 cm depth. For every visited point an accurate site description was made, including the following characteristics:
• Coordinates, taken with a high-precision (mm) OmniStar GPS
• Land use (based on interviews with farmers)
• Vegetation type (most important species) and vegetation cover (visual estimation of the % of soil covered by green vegetation)
• Soil moisture, colour, texture, structure, and consistence (all evaluated by visual estimation and soil description after hand contact)
• Root content estimation
For every visited point, several pictures of the sampled plot and of the surrounding environment (vegetation cover, species, and slope) were taken in order to complete the site description.

3.2.2 Field spectroscopy campaign

One month after the soil sampling campaign, the field spectroscopy campaign was carried out, visiting 111 of the 113 plots (points 62 and 63 were not visited for logistic reasons) (see appendix 1). Soil spectral reflectance was collected with an ASD Fieldspec-Pro radiometer, in 1 nm steps in the 350-2500 nm wavelength range, using a contact probe device in order to:
- eliminate the effect of vegetation on the collected soil spectra;
- minimize the effects of changing light conditions, which are significant when a normal bare fibre is used;
- reduce the risk posed by bad weather conditions (risk of major precipitation in November and December in the Eastern Cape province of South Africa).

At every point, a 50x50 cm plot was defined and ten spectral measurements were collected, 5 with the stone layer and 5 without, because of the constant stoniness of the sampled soil. To reduce the problems related to pixel size in the future up-scaling to airborne and space-borne hyperspectral images, the sampling points were located in areas with either full vegetation cover or bare soil. The five measurements, repeated twice, followed the scheme of one in the centre and four at the corners of the plot (fig. 7).

Figure 7: soil sampling scheme

Afterwards, a sample was taken from the upper surface (down to 1 cm) of the same spot, homogenized, and divided into two bags, one for chemical analyses and one for laboratory spectral analyses.

3.2.3 Soil samples analyses

All the soil samples collected during the soil sampling and field spectroscopy campaigns were chemically analyzed in order to determine OC, Fe oxides and clay contents.

OC was determined using the Walkley and Black method (Walkley and Black, 1934). The mechanical clay content was identified with the segmentation procedure (grain size <2 mm) (Baize and Jabiol, 1995), while the iron content was assessed by measuring the concentration of dithionite-extractable iron oxides (Agbenin, 2003). Subsequently, the same samples collected during the soil sampling and field spectroscopy campaigns were spectrally measured under controlled laboratory conditions, with the same spectrometer used in the field. The topsoil samples were analyzed with and without stones, in order to determine the influence of the stone layer on the soil spectral reflectance as assessed with the field spectrometer, while the 0-20 cm soil samples were analyzed without stones. The samples without stones were analyzed after preparing sub-samples of the ground soil (<2 mm, ~20 g) (Viscarra-Rossel, 2006). All the samples were illuminated with two quartz halogen lamps (1000 W each), mounted on a tripod at a zenith angle of 30°. The reflected light was measured in nadir position. Four measurements were taken, rotating the sample clockwise by 90° each time.

3.3 Models construction

3.3.1 Datasets available

Soil spectral libraries were created from:
• 0-20 cm laboratory spectral data (113 samples)
• topsoil laboratory spectral data (111 samples)
• topsoil field spectral data (111 samples)
• OC, Fe oxides, and clay content topsoil and 0-20 cm chemical analysis results (113 samples)
The spectral libraries, coupled with PLSR, were used to develop diffuse reflectance spectroscopy (DRS) prediction models for organic carbon, iron oxides, and clay content.

3.3.2 Spectral pre-processing

All the spectra collected in the field and in the laboratory were corrected for the ASD “jump” at 1000 nm using the additive correction method, corrected for Spectralon reflectance, and averaged for subsequent processing (McCarty et al., 2002). Based on the chemical analysis results, the datasets were divided into a training set (2/3) and a test set (1/3): for each of the three soil properties, the samples were sorted from the lowest to the highest % content, every third sample was assigned to the test set, and the remaining samples composed the training set (appendixes 4 and 5). Prior to the statistical analyses, spectra from 350 to 399 nm, from 796 to 814 nm, and from 2401 to 2500 nm were excluded, as insensitive or influenced by artefacts produced by the spectrometer (Viscarra-Rossel et al., 2006). Several pre-processing techniques commonly used in soil spectroscopy were applied to enhance spectral features. Calibration models for OC, Fe oxides, and clay content were developed applying the spectral pre-processing summarized below:
- transformation of reflectance (R) spectra to log (1/R), to reduce possible non-linearities in the spectra;
- spectral normalization using multiplicative scatter correction (MSC), to correct for light scattering variations in reflectance spectroscopy (Geladi et al., 1985);
- random noise reduction and signal-to-noise ratio (SNR) improvement using the Savitzky-Golay filter (Savitzky and Golay, 1964), with a second-order polynomial fit and a window size of 3, 6, or 10;
- spectral resolution enhancement and elimination of background effects through first and second derivatives;
- data pre-treatment using mean centring.
The pre-processing techniques were applied differently to each data set, depending on the conditions (laboratory or field) under which it was collected and on the result of the cross-validation. When the cross-validated model was not accurate enough, more pre-processing techniques were applied.
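The division into training and test sets described in this section can be sketched as follows. This is only an illustration: the function name `split_every_third` and the OC values are hypothetical, and taking the third sample of each consecutive triple (rather than the first or second) is an assumption, since the text only states that one sample in every three was assigned to the test set.

```python
def split_every_third(sample_ids, values):
    """Sort samples by the measured property and send every third one to the test set."""
    ranked = [sid for _, sid in sorted(zip(values, sample_ids))]
    test = ranked[2::3]                                   # 3rd, 6th, 9th, ... sample
    train = [sid for i, sid in enumerate(ranked) if i % 3 != 2]
    return train, test


# Made-up OC contents (%) for nine hypothetical plots.
ids = list(range(1, 10))
oc = [1.2, 0.3, 2.5, 0.9, 4.1, 0.2, 1.7, 3.3, 0.6]
train, test = split_every_third(ids, oc)
```

Because the split is made on the sorted values, both sets span the full range of the property, which is the reason for sorting before sampling.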

3.3.3 PLSR model calibration and validation

The models were developed using partial least squares regression (PLSR) (Cozzolino and Moron, 2003). The number of factors for the PLSR analyses was chosen by leave-one-out cross-validation (CV) on the training set (Reeves et al., 2002). To select an optimal, parsimonious PLSR model, different criteria were taken into account:
- the root mean squared error of cross-validation (RMSECV), for the accuracy of the CV;
- the coefficient of determination of cross-validation (R2CV), for how well the model explains the data and predicts new observations (Wold and Sjöström, 2001);
- the Akaike Information Criterion (AIC), which balances goodness of fit against model complexity to avoid overfitting (Li et al., 2002);
- the smallest possible number of factors.
The test set was used for model validation; the R2 between measured and predicted values of the soil parameters and the root mean square error of validation (RMSEV) were used to evaluate the established model (Kooistra et al., 2003). Moreover, the ratio of performance to deviation (RPD) was used to evaluate the prediction ability of the OC, Fe oxides, and clay content models. Chang and Laird (2002) defined three classes of RPD: category A (RPD>2) contains models that can accurately predict the property in question; category B (RPD between 1.4 and 2) is an intermediate class of models that can possibly be improved; category C (RPD<1.4) contains models with no prediction ability. This classification is used in the present study.
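The validation statistics named above can be computed with a few lines of plain Python. The sketch below is illustrative only: the measured and predicted values are made up, the function names are hypothetical, and RPD is assumed here to be the sample standard deviation of the measured values divided by the RMSE of validation. The category limits follow the classification quoted in the text, with the boundary values 1.4 and 2 assigned to category B.

```python
import math
from statistics import stdev


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted))
                     / len(measured))


def rpd(measured, predicted):
    """Ratio of performance to deviation: SD of measured values over RMSE."""
    return stdev(measured) / rmse(measured, predicted)


def rpd_category(value):
    """Class A (RPD > 2), B (1.4 to 2) or C (< 1.4)."""
    if value > 2.0:
        return "A"
    if value >= 1.4:
        return "B"
    return "C"


# Made-up validation data (OC %, illustration only).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
category = rpd_category(rpd(measured, predicted))
```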

For topsoil samples, the effect of superficial stoniness on spectral reflectance was tested for both laboratory and field spectral data. Model accuracies were tested both including and excluding the water absorption bands in all the spectral datasets. The development of the best prediction models for the considered soil properties was based on wavelength range selection (VIS: 400-795 nm, NIR: 815-2400 nm, VNIR: 400-2400 nm) (Viscarra-Rossel, 2006), while the bands included in the prediction models were evaluated by interpreting the variable importance in the projection (VIP), the B coefficients and the spectral loadings (Wold et al., 2001). The ParLes 3.1 software was used to develop the models (Viscarra-Rossel, 2008).

3.4 EnMAP spectral resampling and models development

The Environmental Mapping and Analysis Program (EnMAP) is the future German hyperspectral satellite mission, with over 200 channels within the broad spectral range from 420 nm to 2450 nm and a ground resolution of 30 m. EnMAP will find application in the global determination of ecosystem parameters at high spectral resolution, as well as of biophysical, biochemical and geochemical variables. Because EnMAP data are not yet available, the up-scaling of the field and laboratory spectral data was simulated in this study by spectral resampling to the EnMAP spectral resolution and by Gaussian noise simulation.

3.4.1 Spectral resampling

The spectral resampling was carried out on the topsoil spectra collected both in the field and in the laboratory. The resampling transformed the 2151 bands of the full-resolution spectra collected with the ASD spectral device, from 350 to 2500 nm (1 nm step), into the 233 bands of EnMAP, from 420 to 2450 nm, with 6-10 nm band widths (6 nm up to 900 nm, 10 nm up to 2450 nm).
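The resampling in this study was carried out in IDL-ENVI; the following plain Python sketch only illustrates the principle. It assumes a Gaussian spectral response for each target band, defined by its centre wavelength and full width at half maximum (FWHM). The function name, the four band centres and the test spectra are hypothetical and do not reproduce the actual EnMAP band configuration.

```python
import math


def gaussian_resample(wavelengths, values, centres, fwhm):
    """Resample a fine spectrum to broader bands with Gaussian response functions."""
    resampled = []
    for centre, width in zip(centres, fwhm):
        sigma = width / (2.0 * math.sqrt(2.0 * math.log(2.0)))
        weights = [math.exp(-0.5 * ((w - centre) / sigma) ** 2)
                   for w in wavelengths]
        total = sum(weights)
        resampled.append(sum(wt * v for wt, v in zip(weights, values)) / total)
    return resampled


# Full-resolution grid: 350 to 2500 nm in 1 nm steps.
wavelengths = list(range(350, 2501))
flat = [0.3] * len(wavelengths)              # constant reflectance
ramp = [w * 1e-4 for w in wavelengths]       # linearly increasing reflectance

# Hypothetical band centres (nm) with 6 nm and 10 nm widths.
centres = [423, 897, 1000, 2445]
fwhm = [6, 6, 10, 10]
flat_bands = gaussian_resample(wavelengths, flat, centres, fwhm)
ramp_bands = gaussian_resample(wavelengths, ramp, centres, fwhm)
```

A constant spectrum is unchanged by the resampling, and a linear spectrum keeps its value at the band centre, because the Gaussian weights are symmetric and normalized.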

3.4.2 Noise simulation

To simulate the EnMAP resampling process as realistically as possible, a Gaussian noise component was added, defined through the SNR. EnMAP specifications indicate an SNR of about 500:1 in the VNIR and about 150:1 in the short-wave infrared (SWIR). The noise simulation in this study applied an SNR of 100:1 to all the EnMAP channels. This level of noise was chosen to demonstrate that it is realistic to produce soil property prediction models based on EnMAP data. Both spectral resampling and noise simulation were carried out in the IDL-ENVI environment.
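As an illustration of how an SNR translates into simulated noise, the sketch below adds zero-mean Gaussian noise to each band, with a standard deviation equal to the band signal divided by the SNR. This signal-dependent definition is an assumption made for the example, as are the function name and the flat test spectrum; the noise in this study was generated in IDL-ENVI.

```python
import math
import random


def add_gaussian_noise(spectrum, snr=100.0, seed=None):
    """Add zero-mean Gaussian noise with per-band SD = signal / SNR."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, v / snr) for v in spectrum]


# Flat test spectrum with many values, to check the noise level empirically.
flat = [0.5] * 20000
noisy = add_gaussian_noise(flat, snr=100.0, seed=42)
residuals = [n - f for n, f in zip(noisy, flat)]
noise_sd = math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

For a reflectance of 0.5 and an SNR of 100:1, the expected noise standard deviation is 0.005, which the empirical value approaches for a large number of samples.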

3.4.3 Organic carbon model calibration and validation

The spectral data obtained after spectral resampling and noise simulation were used to build OC prediction models. The up-scaling process was simulated only for the OC prediction models because of the high accuracies obtained with field and laboratory spectral data, and because of STRP interests in South Africa in the carbon sequestration potential of the STB. The resampled datasets were processed as described in 3.3.2. PLSR models for the prediction of SOC were constructed following the methodology described in 3.3.3, except for the analysis of water absorption band (WAB) effects and the selection of VIS, NIR and VNIR wavelength ranges. The ParLes 3.1 software was used to develop the models (Viscarra-Rossel, 2008).

4. Results

4.1 Laboratory analyses

The chemical analysis results for topsoil and 0-20 cm samples (table 1) show wide ranges of concentration for the analyzed properties. The clay concentration range of the topsoil is narrower than that of the 0-20 cm layer, indicating that the lack of vegetation and rainwater percolation cause leaching of clay particles, which accumulate in the deeper horizon.

Table 1: Statistics of the chemical analyses of 0-20 cm and topsoil samples for clay, OC and iron oxides

Statistics   Clay %            OC %              Fe %
             0-20    topsoil   0-20    topsoil   0-20    topsoil
mean         16.65   12.69     1.32    1.35      2.77    2.56
max          44.60   39.15     5.05    6.03      7.28    6.90
min           3.94    0.90     0.20    0.18      1.35    0.73
median       16.56   11.71     1.17    1.00      2.60    2.46
stdev         7.12    7.24     0.91    1.25      0.88    0.98

OC: organic carbon; Fe: iron oxides

OC shows a different trend, with similar mean concentrations in the topsoil and 0-20 cm layers, although the topsoil maximum is higher than the 0-20 cm maximum. Iron oxides concentration (Fe) does not differ significantly between the topsoil and the 0-20 cm layer; the most important difference is in the minimum Fe of the 0-20 cm layer, which is almost double the topsoil minimum, again due to rainwater percolation: Fe particles are transported into the subsoil and the subsequent drought creates the oxidizing conditions for the formation of Fe3+, evidenced by the reddish colour of the 0-20 cm layer observed during the fieldwork. Strong correlations were observed between topsoil and 0-20 cm for OC (0.77) and Fe (0.68), indicating that the topsoil could be representative of the deeper layer for these two properties (table 2). The weak correlation between topsoil and 0-20 cm clay (0.28) could be caused by land use, characterized in this area by either goat farming or game reserve activities. The soil has not undergone intense ploughing, which could explain the stability of OC and Fe from topsoil to 0-20 cm. Conversely, the significant differences in clay were favoured by alluvial deposition and by the constant transport of small particles from the topsoil to the underlying layer.

Table 2: Pearson correlation coefficient between topsoil and 0-20 cm OC, Fe, and clay content

| 0-20 cm (rows) / Topsoil (columns) | OC | Fe | Clay |
|---|---|---|---|
| OC | 0.77 | -0.02 | 0.17 |
| Fe | 0.14 | 0.68 | 0.12 |
| Clay | 0.30 | 0.03 | 0.28 |

OC: organic carbon; Fe: iron oxides
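As an illustration of how a coefficient of the kind reported in table 2 can be computed, the following minimal sketch implements the Pearson correlation coefficient for two paired series. The function name `pearson_r` and the sample values are invented for this example and are not the thesis data.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two paired 1-D series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()          # deviations from the means
    return float((da @ db) / np.sqrt((da @ da) * (db @ db)))

# Invented OC values (%) for five hypothetical plots, for illustration only
oc_topsoil = [0.70, 1.40, 2.60, 0.45, 3.10]
oc_0_20 = [1.00, 1.50, 2.00, 0.60, 2.40]
r = pearson_r(oc_topsoil, oc_0_20)
print(f"r = {r:.2f}")
```

A value close to 1 for OC would support using the topsoil as a proxy for the 0-20 cm layer, while a value close to 0, as found for clay, would not.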

The weak correlations between OC, Fe, and clay (table 2) explain the choice to create three different training and test sets for both topsoil and 0-20 cm soil samples (appendixes IV and V) and to build independent prediction models for the three analyzed properties. The results of the chemical analyses suggest different trends of OC, Fe, and clay in relation to the vegetation cover. For example, plots 5 and 6 were located in the same area (see appendix I), but their visually estimated vegetation cover differed: plot 6 presented an almost pristine environment, while plot 5 was characterized by scarce vegetation cover. The chemical analyses indicate a much higher topsoil and 0-20 cm OC in plot 6 than in plot 5 (see appendix II). At the same time, clay and Fe are higher in plot 5 than in plot 6, with remarkable differences especially for the 0-20 cm samples. The same trend was observed for plots 51, 52, and 53, all located in Kolkfontein farm (see appendix I). Plot 51 was characterized by scarce vegetation cover, and the chemical analyses showed low OC both for 0-20 cm (1.17%) and topsoil (0.88%). The visual estimation for plot 52 revealed a higher vegetation cover than plot 51, and plot 52 accordingly gave higher 0-20 cm and topsoil OC than plot 51 (see appendixes II and III). Plot 53 was characterized by pristine vegetation, showing 0-20 cm and topsoil OC of 2.1% and 2.54% respectively (appendixes II and III). Our observations confirm the trend suggested by Mills and Cowling (2006), according to which the vegetation matrix of the STB enriches the soil with an exceptional amount of carbon for a warm, semiarid region; this effect is evident for both the topsoil and the deeper layer.


4.2 Soil spectra interpretation

In order to give an overview of the differences in spectral reflectance between i) laboratory and field spectra, ii) topsoil and 0-20 cm soil samples, and iii) samples with and without stones, the mean spectra of all the samples measured in the field and in the laboratory were calculated; the results are shown in figure 8. The soil spectra measured on samples where stoniness is not considered (0-20, tlnos, tfnos) show higher reflectance and fewer absorption peaks than the spectra measured including the stones. The stoniness present in the STB consists of small (average diameter < 2 cm) dark stones covering the topsoil, which cause marked differences between the topsoil field reflectance without stones (tfnos) and with stones (tfs) (fig. 8). To build a robust prediction model for any soil property, stoniness should therefore be taken into account, especially in view of an up-scaling scenario, where stoniness would significantly reduce the light reflected and detected by the sensor.

Figure 8: Mean laboratory and field spectra obtained from topsoil and 0-20cm soil samples

0-20: 0-20 cm samples mean laboratory spectra; TFNOS: topsoil samples mean field spectra without stones; TFS: topsoil samples mean field spectra with stones; TLNOS: topsoil samples mean laboratory spectra without stones; TLS: topsoil samples mean laboratory spectra with stones

The highest reflectance was measured under laboratory conditions and without stones (tlnos) (figure 8). The difference between tlnos and tfnos is smaller than expected, probably because of the use of the contact probe, which reduces the gap between laboratory and field in terms of light stability, as shown by the similar reflectance values of tfnos and the 0-20 cm laboratory spectra (0-20). As expected, the average tfs reflectance is markedly lower than that of the topsoil laboratory spectra with stones (tls), indicating that, under controlled laboratory conditions, the stones do not reduce the light reflected by the soil as much as they do in the field.

The differences in soil chemical contents among plots 51, 52, and 53, detected with the chemical analyses (par. 4.1 and appendixes II and III), are also visible in the topsoil spectral reflectance values (fig. 9). The trend observed in figure 8 is also evident in figure 9, which shows that more reflected light is detected by the sensor under controlled laboratory conditions than in the field. As mentioned before, plot 51 was characterized by lower vegetation cover and topsoil OC than plots 52 and 53 (see 4.1). This observation is supported by the higher amount of light reflected by sample 51 than by samples 52 and 53, in agreement with the theory that the higher the OC, the lower the spectral reflectance (Kooistra et al., 2003).

Figure 9: Topsoil laboratory and field spectra of plot 51, 52, and 53

This difference is more pronounced for field than for laboratory spectra, showing that field conditions emphasize the spectral gaps caused by the different soil properties. The up-scaling simulation to EnMAP consisted of resampling the ASD laboratory and field spectra to the EnMAP spectral resolution and adding the highest possible noise component (see 3.4). The resampled spectra of plot 7 show wide variations with respect to the original ASD spectra (figure 10) across the full spectral range. The differences are more pronounced in the NIR range, while they are almost null at the water absorption bands, due to the EnMAP spectral resolution. The figure also shows the lower reflectance values of the tfs data compared with the tfnos and tl spectra.


Figure 10: ASD and resampled EnMAP topsoil laboratory and field spectra of plot 7
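The up-scaling simulation (resampling to a coarser sensor plus a noise component) can be sketched as follows. This is only an illustration under stated assumptions: the sensor response is approximated with Gaussian spectral response functions, the band centres and the 10 nm FWHM are placeholder values rather than the real EnMAP specification, and the noise is modelled as zero-mean Gaussian with a standard deviation of signal/SNR. The function names `resample_gaussian` and `add_noise` are hypothetical; the procedure actually used is the one described in section 3.4.

```python
import numpy as np

def resample_gaussian(wl_src, spectrum, centers, fwhm):
    """Resample a fine-resolution spectrum to broader bands (Gaussian response)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))    # FWHM -> standard deviation
    out = np.empty(len(centers))
    for i, (c, s) in enumerate(zip(centers, sigma)):
        weights = np.exp(-0.5 * ((wl_src - c) / s) ** 2)  # response of band i
        out[i] = np.sum(weights * spectrum) / np.sum(weights)
    return out

def add_noise(spectrum, snr, rng):
    """Add zero-mean Gaussian noise with per-band sigma = signal / SNR."""
    return spectrum + rng.normal(0.0, np.abs(spectrum) / snr)

# Placeholder inputs: 1 nm source grid and 232 evenly spaced bands (assumed values)
wl = np.arange(350.0, 2501.0, 1.0)
reflectance = 0.2 + 0.1 * (wl - 350.0) / 2150.0           # synthetic smooth spectrum
centers = np.linspace(420.0, 2450.0, 232)
fwhm = np.full(232, 10.0)

resampled = resample_gaussian(wl, reflectance, centers, fwhm)
noisy = add_noise(resampled, snr=100, rng=np.random.default_rng(0))
```

With SNR = 100 the added noise is about 1% of the signal in each band, which is the worst case considered in the simulation.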

4.3 Models calibration and validation

4.3.1 Organic carbon

Calibration and validation of the 0-20 cm OC prediction models produced equal accuracy with and without water absorption bands (0-20 and 0-20_NWAB respectively), with an R2 higher than 0.8 and the same data manipulation (table 3). The calibration of the topsoil laboratory (TL) prediction model gave an RMSECV of 0.458, slightly lower than that of the same model built excluding the water absorption bands (TL_NWAB); at the same time, with 4 factors selected for the PLSR, TL_NWAB produced better results in the validation phase. The two models, built after the same data manipulations, do not show remarkable differences in accuracy (table 3). The topsoil field no stones model without water absorption bands (TFNS_NWAB) was built with 9 factors. In calibration it gave less error than the topsoil field no stones model with water absorption bands (TFNS), but the latter was built with 8 factors and, once validated, produced less error and a higher R2 than the former; TFNS_NWAB received mean centre pre-treatment, while TFNS was not pre-treated because this gave better calibration accuracy (table 3). The topsoil field stone model (TFS) and the TFS model without water absorption bands (TFS_NWAB), after receiving the same data manipulation, produced similar results for both calibration and validation. TFS_NWAB was built with 7 factors, while TFS was built with only 5 factors (table 3). The development of the best OC prediction models required the analysis of the separate VIS and NIR parts of the spectrum (Viscarra-Rossel, 2006) (table 4).


Table 3: calibration and validation of the organic carbon prediction models using full spectral data

| Spectral data | Training | Test | R to Log(1/R) | De-noising | Differentiation | Pre-treatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | x | SG(2,6) | 1st deriv | Mean center | 0.86 | 0.359 | 5 | 0.826 | 0.356 | 2.27 |
| 0-20_NWAB | 76 | 37 | x | SG(2,6) | 1st deriv | Mean center | 0.86 | 0.359 | 5 | 0.827 | 0.354 | 2.29 |
| TL | 75 | 36 | x | SG(2,6) | 1st deriv | Mean center | 0.87 | 0.458 | 5 | 0.831 | 0.41 | 2.33 |
| TL_NWAB | 75 | 36 | x | SG(2,6) | 1st deriv | Mean center | 0.866 | 0.464 | 4 | 0.83 | 0.405 | 2.36 |
| TFNS | 75 | 36 | x | SG(2,6) | | | 0.81 | 0.557 | 8 | 0.822 | 0.432 | 2.22 |
| TFNS_NWAB | 75 | 36 | x | SG(2,6) | | Mean center | 0.816 | 0.547 | 9 | 0.797 | 0.469 | 2.04 |
| TFS | 75 | 36 | x | SG(2,3) | 1st deriv | Mean center | 0.719 | 0.651 | 6 | 0.727 | 0.567 | 1.69 |
| TFS_NWAB | 75 | 36 | x | SG(2,3) | 1st deriv | Mean center | 0.716 | 0.681 | 7 | 0.714 | 0.588 | 1.63 |

0-20: 0-20cm lab spectra model; NWAB: NO water absorption bands; TL: topsoil lab spectra model; TFNS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones
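The calibration and validation chain summarized in table 3 (transformation of reflectance R to log(1/R), first derivative, mean centring, PLSR, cross-validated and test-set errors) can be sketched in a few lines. The sketch below is not the software used in this study: it implements single-response PLS with the NIPALS algorithm in NumPy, replaces the Savitzky-Golay step with a plain finite-difference derivative, assumes leave-one-out cross-validation, and runs on synthetic spectra. All function names are hypothetical.

```python
import numpy as np

def preprocess(reflectance):
    """Reflectance -> apparent absorbance log10(1/R) -> first derivative."""
    absorbance = np.log10(1.0 / reflectance)
    return np.gradient(absorbance, axis=1)     # finite difference along the bands

def pls1_fit(X, y, n_factors):
    """Fit PLS1 with NIPALS; return (regression coefficients b, intercept b0)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # mean centring
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)              # loading weights
        t = Xk @ w                             # scores
        tt = t @ t
        p = Xk.T @ t / tt                      # X loadings
        c = (yk @ t) / tt                      # y loading
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)        # B coefficients
    return b, y_mean - x_mean @ b

def predict(X, b, b0):
    return X @ b + b0

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmsecv_loo(X, y, n_factors):
    """Leave-one-out cross-validated RMSE (assumed CV scheme)."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        b, b0 = pls1_fit(X[keep], y[keep], n_factors)
        preds[i] = X[i] @ b + b0
    return rmse(y, preds)

# Synthetic example: absorbance grows with OC, so higher OC gives darker spectra
rng = np.random.default_rng(0)
oc = rng.uniform(0.2, 5.0, 60)
band = np.linspace(0.0, 1.0, 50)
absorb = 0.3 + 0.05 * oc[:, None] * (1.0 + band) + 1e-4 * rng.standard_normal((60, 50))
X = preprocess(10.0 ** (-absorb))

b, b0 = pls1_fit(X[:40], oc[:40], n_factors=2)     # training set
pred = predict(X[40:], b, b0)                      # independent test set
print(rmsecv_loo(X[:40], oc[:40], 2), rmse(oc[40:], pred), r2(oc[40:], pred))
```

The three printed values correspond to the RMSECV, RMSEV, and validation R2 columns of the table; on real spectra the number of factors would be chosen by comparing the RMSECV obtained with different values.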

The general trend indicates that the NIR spectra give better results for both calibration and validation, with some exceptions. The 0-20 cm VIS spectra model (0-20_VIS) gave a higher R2CV and a lower RMSECV than the 0-20 cm NIR spectra model (0-20_NIR) but, once validated, the latter produced higher accuracy (RMSEV: 0.358).

Table 4: calibration and validation of the OC prediction models using visible and near-infrared spectral data

| Spectral data | Training | Test | R to Log(1/R) | LSC | De-noising | Differentiation | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20_VIS | 76 | 37 | x | | MedFil | 1st deriv | Mean center | 0.829 | 0.396 | 3 | 0.776 | 0.400 | 2.03 |
| 0-20_NIR | 76 | 37 | x | MSC | | | Mean center | 0.762 | 0.468 | 4 | 0.829 | 0.358 | 2.26 |
| TL_VIS | 75 | 36 | x | | SG(2,6) | | | 0.837 | 0.513 | 9 | 0.823 | 0.486 | 1.97 |
| TL_NIR | 75 | 36 | x | | SG(2,6) | | | 0.855 | 0.485 | 7 | 0.845 | 0.374 | 2.55 |
| TFNS_VIS | 75 | 36 | x | MSC | SG(2,6) | | Mean center | 0.757 | 0.626 | 2 | 0.778 | 0.495 | 1.93 |
| TFNS_NIR | 75 | 36 | x | | SG(2,6) | 1st deriv | Mean center | 0.827 | 0.531 | 5 | 0.857 | 0.366 | 2.62 |
| TFS_VIS | 75 | 36 | x | | SG(2,3) | | Mean center | 0.723 | 0.670 | 5 | 0.639 | 0.688 | 1.39 |
| TFS_NIR | 75 | 36 | x | | SG(2,3) | 1st deriv | Mean center | 0.676 | 0.733 | 6 | 0.759 | 0.497 | 1.92 |

0-20_VIS: 0-20cm visible lab spectra model; 0-20_NIR: 0-20cm near-infrared lab spectra model; TL_VIS: topsoil lab visible spectra model; TL_NIR: topsoil lab near-infrared spectra model; TFNS_VIS: topsoil field visible spectra model without stones; TFNS_NIR: topsoil field near-infrared spectra model without stones; TFS_VIS: topsoil field visible spectra model with stones; TFS_NIR: topsoil field near-infrared spectra model with stones

Once validated, all the models built using NIR spectra produced higher accuracies than the models built with VIS spectra, but they required more latent factors, except for the topsoil laboratory case (TL_NIR), which was calibrated with fewer factors than the topsoil laboratory VIS model (TL_VIS).


The results obtained using only a selection of the measured spectral region are very promising compared with the full spectral resolution models. In particular, the topsoil field NIR spectra model with stones (TFS_NIR) (figure 14) gave higher accuracy (RMSEV: 0.497) than TFS_NWAB (table 3). According to the RPD classification of Chang and Laird (2002), the models developed can accurately predict OC (tables 3 and 4), except those with RPD < 2, which are mainly the topsoil field models with stones. TFS_VIS is the only model that showed no prediction ability (RPD < 1.40), while all the other models with RPD < 2 can be improved. Three of the four best OC prediction models were produced using the NIR bands; the exception is the 0-20 cm model, developed using full-resolution spectra without water absorption bands (0-20_NWAB) (table 3 and figure 11). The topsoil laboratory and field models with the best accuracies were built from NIR bands (table 4 and figures 12, 13, and 14). TFS_NIR (figure 14), with an RPD of 1.92, was the only topsoil field stone model which, with a little improvement, could be used for OC predictions under field conditions with good accuracy.
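The RPD values used above can be reproduced, to within rounding, as the ratio between the standard deviation of the observed test values and the RMSEV. The short sketch below illustrates this together with the three classes commonly cited from Chang and Laird (2002): A (RPD > 2.0), B (1.4 to 2.0), and C (RPD < 1.4). The function names are hypothetical, and the thresholds are stated here as an assumption to be checked against the original reference.

```python
def rpd(sd_observed, rmse_validation):
    """Ratio of performance to deviation: SD of observed values / RMSE."""
    return sd_observed / rmse_validation

def rpd_class(value):
    """Classes as commonly cited from Chang and Laird (2002)."""
    if value > 2.0:
        return "A"   # accurate predictions
    if value >= 1.4:
        return "B"   # model can possibly be improved
    return "C"       # no reliable prediction ability

# 0-20 cm OC model: observed test SD 0.81 (table 5), RMSEV 0.356 (table 3)
value = rpd(0.81, 0.356)
print(round(value, 2), rpd_class(value))
```

The computed ratio is within rounding of the 2.27 reported in table 3; small differences are expected because the tabulated SD and RMSEV are themselves rounded.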

Figure 11: 0-20cm model, without water absorption bands, OC predicted vs. observed values

Figure 12: topsoil laboratory NIR model OC predicted vs. observed values

Figure 13: topsoil field without stones NIR OC predicted vs. observed values

Figure 14: topsoil field with stones NIR OC predicted vs. observed values


Table 5 compares the observed OC values, analyzed using conventional laboratory methods, with the validated PLSR predictions obtained from the VIS, NIR, and VNIR regions of the spectrum. Most of the prediction ranges have a negative lower extreme, but the range values are close to the observed ones. The results obtained with VIS and NIR data, under both field and laboratory conditions, are as accurate as the full-resolution ones, which points to the feasibility of OC predictions with less data and faster analyses.

Table 5: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible-near-infrared (VNIR) regions of the EM spectrum

| OC | Train | Test | Observed mean | Observed SD | Observed range (%) | VIS mean | VIS SD | VIS range (%) | NIR mean | NIR SD | NIR range (%) | VNIR NWAB mean | VNIR NWAB SD | VNIR NWAB range (%) | VNIR mean | VNIR SD | VNIR range (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | 1.28 | 0.81 | 0.22 to 3.60 | 1.33 | 0.84 | -0.29 to 3.25 | 1.26 | 0.81 | -0.25 to 3.9 | 1.30 | 0.86 | 0 to 3.45 | 1.30 | 0.86 | 0 to 3.35 |
| TL | 75 | 36 | 1.26 | 0.96 | 0.21 to 4.32 | 1.39 | 1.12 | -0.55 to 5.65 | 1.22 | 0.88 | -0.23 to 3.64 | 1.26 | 0.99 | -0.37 to 4.53 | 1.25 | 1.01 | -0.38 to 4.63 |
| TFNOS | 75 | 36 | 1.26 | 0.96 | 0.21 to 4.32 | 1.35 | 1.05 | -0.07 to 5.13 | 1.18 | 0.89 | -0.12 to 4.12 | 1.30 | 1.05 | -0.20 to 5.34 | 1.23 | 1.03 | -0.11 to 5.22 |
| TFS | 75 | 36 | 1.26 | 0.96 | 0.21 to 4.32 | 1.42 | 1.13 | -0.43 to 4.40 | 1.14 | 0.96 | -0.53 to 3.92 | 1.27 | 1.12 | -0.40 to 4.76 | 1.35 | 1.08 | -0.42 to 4.71 |

0-20: 0-20cm lab spectra model; TL: topsoil lab spectra model; TFNOS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones
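Descriptive statistics of the kind listed in table 5 (mean, SD, and range of the validated predictions) are straightforward to compute, and counting the negative predictions makes the issue of below-zero OC values explicit. The following is a minimal sketch using only the Python standard library; the function name `summarize` and the prediction values are invented for illustration.

```python
from statistics import mean, stdev

def summarize(predictions):
    """Mean, sample SD, range, and number of negative predictions."""
    return {
        "mean": mean(predictions),
        "sd": stdev(predictions),
        "min": min(predictions),
        "max": max(predictions),
        "n_negative": sum(p < 0 for p in predictions),
    }

# Invented OC predictions (%), not thesis data
summary = summarize([-0.2, 0.5, 1.0, 2.7])
print(summary)
```

A non-zero `n_negative` flags predictions that are physically impossible for a concentration and therefore indicates where a model still needs improvement.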

The results of the spectral analyses contrast with the variable importance for projection (VIP) representation (figures 15 and 16).

Figure 15: variable importance for projection (VIP) representation related to topsoil laboratory (TL) organic carbon prediction model, considering the full spectral resolution (a), and excluding the water absorption bands (b)

The best topsoil laboratory (TL) and field stone (TFS) OC prediction models were both obtained using the NIR spectral ranges (tables 3 and 4). The VIP data indicate that for TL the nine most important bands all fall within the WAB range (figure 15a); the exclusion of the WAB (figure 15b) shows that the six most important bands lie beyond 1970 nm, but that the most significant band ranges are in the VIS (400-700 nm) and SWIR (2000-2400 nm). The exclusion of the WAB did not produce a significant decrease in accuracy for the OC prediction based on laboratory spectral data (see table 3).
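For reference, the VIP score of a spectral band is usually defined as below. This is the standard textbook formulation for a single-response PLSR model and is given here as an assumption, since the exact definition depends on the software used; the symbols are introduced only for this illustration: p is the number of bands, A the number of PLSR factors, w_a the loading weight vector of factor a, t_a its score vector, and q_a its y-loading.

```latex
\mathrm{VIP}_j = \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
       {\sum_{a=1}^{A} SS_a} \,},
\qquad
SS_a = q_a^{2}\, \mathbf{t}_a^{\top} \mathbf{t}_a
```

Because the squared VIP scores average to 1 over all bands, bands with VIP greater than 1 are commonly regarded as the most influential ones.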


Figure 16: variable importance for projection (VIP) representation of topsoil field stone (TFS) organic carbon prediction model, considering the full spectral resolution (a) and excluding the water absorption bands (b)

The TFS full spectral resolution model (figure 16a) shows high VIP values in the VIS spectral range, with a peak of importance around 1900 nm. The exclusion of the WAB (figure 16b) indicates that the nine most important wavelengths are in the green region of the VIS spectral range. Again, the exclusion of the WAB for the prediction of OC based on field data did not result in a significant difference in accuracy (table 3).
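Building the NWAB variants of a model amounts to masking out the water absorption windows before calibration. The sketch below shows one way to do this; the window limits in `WAB_WINDOWS` are hypothetical values chosen for illustration, not the limits used in this study, and the function name `drop_bands` is likewise invented.

```python
import numpy as np

# Hypothetical water absorption windows in nm (assumed, not from the thesis)
WAB_WINDOWS = [(1350.0, 1450.0), (1800.0, 1950.0)]

def drop_bands(wavelengths, spectra, windows=WAB_WINDOWS):
    """Remove the columns of `spectra` whose wavelength falls inside any window."""
    keep = np.ones(wavelengths.shape, dtype=bool)
    for low, high in windows:
        keep &= ~((wavelengths >= low) & (wavelengths <= high))
    return wavelengths[keep], spectra[:, keep]

wl = np.arange(350, 2501, dtype=float)          # 1 nm grid, 2151 bands
spectra = np.ones((3, wl.size))                 # three dummy spectra
wl_nwab, spectra_nwab = drop_bands(wl, spectra)
print(wl.size, wl_nwab.size)
```

The reduced matrix can then be passed to the same pre-processing and PLSR steps as the full spectra, so that the two variants differ only in the bands included.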

4.3.2 Iron Oxides

The iron oxides (Fe) calibration models built with full-resolution spectra produced similar accuracies for laboratory and field spectral data, with an RMSECV always < 1 but a low R2CV (never more than 0.205) (table 6). The number of latent variables used to build the Fe PLSR prediction models was between 3 and 7; models developed with laboratory data were built with fewer factors than those produced with field data (table 6). There are no significant differences between laboratory and field models built with or without the water absorption bands. Once validated, the models that gave the highest accuracies (lowest RMSEV) were the topsoil field model without stones and without water absorption bands (TFNS_NWAB) (R2: 0.358) and the topsoil field model with stones (TFS) (R2: 0.283); again, stoniness proves to be an important factor to include in the data analyses, due to its influence on the amount of light reflected by the soil.

Table 6: calibration and validation of Fe oxides prediction models using full spectral data

| Spectral data | Training | Test | R to Log(1/R) | LSC | De-noising | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | x | MSC | | MC | 0.151 | 0.871 | 4 | 0.285 | 0.647 | 1.19 |
| 0-20_NWAB | 76 | 37 | x | MSC | SG(2,10) | MC | 0.119 | 0.876 | 4 | 0.246 | 0.667 | 1.15 |
| TL | 75 | 36 | x | MSC | SG(2,10) | MC | 0.188 | 0.882 | 4 | 0.211 | 0.633 | 1.08 |
| TL_NWAB | 75 | 36 | x | MSC | | MC | 0.176 | 0.898 | 5 | 0.261 | 0.616 | 1.10 |
| TFNS | 75 | 36 | x | MSC | | MC | 0.205 | 0.913 | 6 | 0.288 | 0.588 | 1.16 |
| TFNS_NWAB | 75 | 36 | x | MSC | | MC | 0.185 | 0.901 | 6 | 0.358 | 0.582 | 1.23 |
| TFS | 75 | 36 | x | | | MC | 0.138 | 0.912 | 6 | 0.283 | 0.587 | 1.16 |
| TFS_NWAB | 75 | 36 | x | | | MC | 0.129 | 0.918 | 6 | 0.221 | 0.608 | 1.12 |

0-20: 0-20cm lab spectra model; NWAB: NO water absorption bands; TL: topsoil lab spectra model; TFNS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones

The analysis of the separate VIS and NIR parts of the spectrum produced better Fe prediction models (table 7). However, the calibration models still show weak relations. As for the OC prediction models (table 4), NIR spectral data produced more accurate models than VIS data, except for the 0-20 cm predictions, where the VIS spectra gave a higher R2 (0.460) and a lower RMSEV (0.585) than the NIR spectra (table 7).

Table 7: calibration and validation of the Fe prediction models using visible and near-infrared spectral data

| Spectral data | Training | Test | R to Log(1/R) | LSC | De-noising | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20_VIS | 76 | 37 | x | | SG(2,3) | | 0.065 | 0.935 | 7 | 0.460 | 0.585 | 1.31 |
| 0-20_NIR | 76 | 37 | x | | SG(2,3) | | 0.269 | 0.796 | 7 | 0.350 | 0.655 | 1.17 |
| TL_VIS | 75 | 36 | x | MSC | SG(2,3) | MC | 0.017 | 0.978 | 3 | 0.125 | 0.630 | 1.08 |
| TL_NIR | 75 | 36 | x | MSC | | MC | 0.206 | 0.880 | 3 | 0.391 | 0.541 | 1.26 |
| TFNS_VIS | 75 | 36 | x | MSC | | | 0.007 | 1.025 | 6 | 0.237 | 0.634 | 1.13 |
| TFNS_NIR | 75 | 36 | x | MSC | SG(2,10) | MC | 0.245 | 0.859 | 5 | 0.476 | 0.522 | 1.37 |
| TFS_VIS | 75 | 36 | x | | | MC | 0.045 | 0.991 | 6 | 0.255 | 0.633 | 1.08 |
| TFS_NIR | 75 | 36 | x | | | MC | 0.214 | 0.874 | 6 | 0.392 | 0.543 | 1.25 |

0-20_VIS: 0-20cm visible lab spectra model; 0-20_NIR: 0-20cm near-infrared lab spectra model; TL_VIS: topsoil lab visible spectra model; TL_NIR: topsoil lab near-infrared spectra model; TFNS_VIS: topsoil field visible spectra model without stones; TFNS_NIR: topsoil field near-infrared spectra model without stones; TFS_VIS: topsoil field visible spectra model with stones; TFS_NIR: topsoil field near-infrared spectra model with stones

All the VIS calibration models resulted in a lower accuracy (RMSECV > 0.9) than the NIR models. Topsoil field data showed better results than topsoil laboratory data, for both the calibration and validation phases.

Figure 17: topsoil field stone NIR iron content predicted Vs observed values

The calibration and validation procedures showed that, under STB conditions, the NIR part of the spectrum explains the Fe content of the soil better than the VIS and the full spectrum (tables 6 and 7). However, the RPD results obtained do not show any ability to predict Fe content. According to Chang and Laird (2002), the models produced fall into category C (RPD < 1.4), which indicates that they cannot be improved. Figure 17 shows the predicted versus observed values for the NIR topsoil field stone model, which illustrate the inaccuracy of the model, together with its poor validation statistics (R2: 0.392; RPD: 1.25). The comparison between the observed Fe values, analyzed using conventional laboratory methods, and the validated predictions from the VIS, NIR, and VNIR regions of the spectrum confirms that, as for OC (table 5), the NIR part of the spectrum produces the best results. The VIS and NIR results show better prediction capacity than VNIR_NWAB and VNIR, which do not cover wide ranges (table 8). VIS predictions gave better results for TFS than for TFNS, indicating a higher sensitivity to stoniness, while the highest Fe extremes predicted by NIR, VNIR_NWAB, and VNIR are far lower than the observed value.

Table 8: Statistical description of the observed soil iron oxides (Fe), analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible-near-infrared (VNIR) regions of the EM spectrum

| Fe | Training | Test | Observed mean | Observed SD | Observed range (%) | VIS mean | VIS SD | VIS range (%) | NIR mean | NIR SD | NIR range (%) | VNIR NWAB mean | VNIR NWAB SD | VNIR NWAB range (%) | VNIR mean | VNIR SD | VNIR range (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | 2.76 | 0.77 | 1.50 to 5.00 | 2.93 | 0.50 | 2.07 to 4.36 | 2.92 | 0.63 | 1.48 to 4.19 | 2.84 | 0.45 | 1.72 to 3.91 | 2.84 | 0.44 | 1.54 to 3.79 |
| TL | 75 | 36 | 2.50 | 0.68 | 1.48 to 4.18 | 2.56 | 0.25 | 1.95 to 3.06 | 2.63 | 0.48 | 1.28 to 3.61 | 2.63 | 0.53 | 0.93 to 3.64 | 2.61 | 0.50 | 1.24 to 3.36 |
| TFNS | 75 | 36 | 2.52 | 0.72 | 1.48 to 4.18 | 2.67 | 0.26 | 1.83 to 3.24 | 2.57 | 0.59 | 0.98 to 3.66 | 2.60 | 0.54 | 1.17 to 3.78 | 2.58 | 0.51 | 1.29 to 3.39 |
| TFS | 75 | 36 | 2.52 | 0.72 | 1.48 to 4.18 | 2.65 | 0.55 | 1.60 to 4.66 | 2.46 | 0.57 | 1.23 to 3.63 | 2.57 | 0.44 | 1.71 to 3.32 | 2.55 | 0.50 | 1.58 to 3.47 |

0-20: 0-20cm lab spectra model; TL: topsoil lab spectra model; TFNS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones

The opposite holds when stoniness is not a factor: in that case NIR and VNIR_NWAB produce more accurate results. Overall, field data produced a higher R2 and a lower RMSEV than laboratory data for all the spectral band ranges considered (table 8).

Figure 18: B coefficients (VIP) of topsoil field stones iron oxides prediction model

Figure 19: spectral loadings and loadings weights of topsoil field stones iron oxides prediction model

The representation of the B coefficients (fig. 18) shows that both VIS (475-556 nm) and NIR (793-1073 nm; 2011-2239 nm) band ranges influence the TFS iron oxides prediction model. The interpretation of the spectral loadings and loading weights (figure 19) confirms that the NIR band range up to 1900 nm (WAB) is the most influential part of the spectrum for Fe prediction based on field data.
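As a reminder of what the B coefficients represent, a PLSR model can be written as a linear combination of the pre-processed band values. The notation below is the standard one for a single-response model and is introduced here only for illustration: x_j is the pre-processed value of band j, W and P are the matrices of loading weights and X loadings, and q is the vector of y-loadings.

```latex
\hat{y} = b_0 + \sum_{j=1}^{p} b_j \, x_j ,
\qquad
\mathbf{b} = \mathbf{W} \left( \mathbf{P}^{\top} \mathbf{W} \right)^{-1} \mathbf{q}
```

Bands with large absolute b_j contribute most to the prediction, and the sign of b_j gives the direction of the relationship for the pre-processed variable, which is why peaks in the B coefficient plots are read as the most influential wavelength ranges.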

4.3.3 Clay content

All the clay content prediction models built with full spectral resolution presented a very low accuracy in the calibration phase (RMSECV > 7) (table 9). The number of factors selected for the PLSR is not more than 6 for any of these models but, once validated, the R2 is never more than 0.255 and the RMSEV remains high (always > 5). The models calibrated on the separate VIS and NIR ranges (table 10) produced an R2CV lower than 0.07 and an RMSECV higher than 7, with between 3 and 7 factors included in the analyses. In the validation phase TFNS_VIS was the best of these models, with an R2 of 0.235 and an RMSEV of 5.187, but none of the models produced sufficient accuracy (table 10).


Table 9: calibration and validation of clay content prediction models using full spectral data

| Spectral data | Training | Test | R to Log(1/R) | LS and BL corr | De-noising | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | x | | SG(2,3) | MC | 0.04 | 7.307 | 4 | 0.157 | 6.273 | 1.09 |
| 0-20_NWAB | 76 | 37 | x | | SG(2,10) | MC | 0.041 | 7.310 | 4 | 0.160 | 6.259 | 1.09 |
| TL | 75 | 36 | x | MSC | SG(2,4) | | 0.106 | 7.175 | 4 | 0.215 | 5.186 | 1.14 |
| TL_NWAB | 75 | 36 | x | MSC | SG(2,10) | | 0.097 | 7.466 | 4 | 0.180 | 5.385 | 1.10 |
| TFNS | 75 | 36 | x | MSC | | MC | 0.02 | 7.739 | 6 | 0.132 | 5.523 | 1.07 |
| TFNS_NWAB | 75 | 36 | x | MSC | SG(2,20) | | 0.003 | 7.486 | 4 | 0.250 | 5.133 | 1.16 |
| TFS_NWAB | 75 | 36 | x | MSC | SG(2,6) | | 0.009 | 7.499 | 6 | 0.119 | 6.378 | 0.95 |

0-20: 0-20cm lab spectra model; NWAB: NO water absorption bands; TL: topsoil lab spectra model without stones; TFNS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones

Both laboratory and field spectral data generated insufficient results for the clay content prediction (table 9). The analysis of the VIS and NIR spectral ranges did not produce better clay prediction models. The poor performance of the prediction models is shown by the RPD values, which peaked at 1.20 with the topsoil laboratory VIS spectra (table 10). The comparison between the observed clay values, analyzed using conventional laboratory methods, and the validated PLSR predictions from the VIS, NIR, and VNIR regions of the spectrum (table 11) confirmed the inefficiency of this technique under these conditions.

Table 10: calibration and validation of clay prediction models using visible and near-infrared spectral data

| Spectral data | Training | Test | R to Log(1/R) | LSC | De-noising | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20_VIS | 76 | 37 | x | | SG(2,6) | | 0.003 | 7.340 | 3 | 0.030 | 6.690 | 1.02 |
| 0-20_NIR | 76 | 37 | x | | SG(2,6) | MC | 0.051 | 7.191 | 3 | 0.125 | 6.377 | 1.07 |
| TL_VIS | 75 | 36 | x | | | MC | 0.058 | 7.075 | 4 | 0.125 | 5.530 | 1.20 |
| TL_NIR | 75 | 36 | x | MSC | | MC | 0.062 | 7.335 | 6 | 0.193 | 5.346 | 1.11 |
| TFNS_VIS | 75 | 36 | x | | | MC | 0.053 | 7.206 | 5 | 0.235 | 5.187 | 1.14 |
| TFNS_NIR | 75 | 36 | x | | | MC | 0.063 | 7.269 | 7 | 0.109 | 6.159 | 0.96 |
| TFS_VIS | 75 | 36 | x | | SG(2,3) | MC | 0.005 | 7.457 | 5 | 0.054 | 5.737 | 1.03 |
| TFS_NIR | 75 | 36 | x | | | | 0.041 | 7.451 | 7 | 0.010 | 6.586 | 0.90 |

0-20_VIS: 0-20cm visible lab spectra model; 0-20_NIR: 0-20cm near-infrared lab spectra model; TL_VIS: topsoil lab visible spectra model; TL_NIR: topsoil lab near-infrared spectra model; TFNS_VIS: topsoil field visible spectra model without stones; TFNS_NIR: topsoil field near-infrared spectra model without stones; TFS_VIS: topsoil field visible spectra model with stones; TFS_NIR: topsoil field near-infrared spectra model with stones


The clay ranges obtained indicate that none of the models produced, over all the spectral ranges considered, predicts with a sufficient level of accuracy. For TFS, VNIR and VNIR_NWAB show lower extremes of -7.67 and -4.02, respectively. Neither the VIS nor the NIR band ranges, which produced very good results for OC (tables 4 and 5) and improvements for the Fe content predictions (table 7), give better predicted ranges.

Table 11: Statistical description of the observed soil clay content, analyzed using conventional laboratory methods of analyses, and their validated PLSR predictions in each of the visible (VIS), near-infrared (NIR), and combined visible-near-infrared (VNIR) regions of the EM spectrum

| Clay | Training | Test | Observed mean | Observed SD | Observed range (%) | VIS mean | VIS SD | VIS range (%) | NIR mean | NIR SD | NIR range (%) | VNIR NWAB mean | VNIR NWAB SD | VNIR NWAB range (%) | VNIR mean | VNIR SD | VNIR range (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 76 | 37 | 16.78 | 6.83 | 4.95 to 39.07 | 16.40 | 3.94 | 8.26 to 25.36 | 16.84 | 3.41 | 7.74 to 25.20 | 16.31 | 3.65 | 6.89 to 23.05 | 16.21 | 3.59 | 6.34 to 24.25 |
| TL | 75 | 36 | 12.36 | 5.93 | 1.90 to 26.92 | 12.77 | 2.82 | 5.70 to 19.88 | 12.69 | 3.57 | 5.66 to 20.08 | 12.10 | 4.40 | 7.24 to 16.02 | 12.09 | 2.80 | 5.47 to 17.50 |
| TFNOS | 75 | 36 | 12.28 | 5.93 | 1.90 to 26.92 | 12.46 | 3.75 | 5.58 to 19.95 | 12.51 | 4.71 | 4.68 to 32.50 | 12.17 | 2.12 | 6.07 to 15.72 | 12.08 | 2.28 | 5.58 to 15.84 |
| TFS | 75 | 36 | 12.36 | 5.93 | 1.90 to 26.92 | 12.85 | 2.18 | 7.57 to 16.15 | 12.47 | 3.71 | 6.52 to 22.28 | 11.41 | 3.44 | -4.02 to 16.81 | 12.54 | 4.37 | -7.67 to 19.26 |

0-20: 0-20cm lab spectra model; TL: topsoil lab spectra model; TFNOS: topsoil field spectra model without stones; TFS: topsoil field spectra model with stones

4.3.4 EnMAP simulations

The results of the OC prediction models using EnMAP spectral resampling and noise simulation are presented in table 12.

Table 12: calibration and validation of the ASD OC prediction models and EnMAP resampled spectral data

| EnMAP data | Training | Test | R to Log(1/R) | De-noising | Differentiation | Pretreatment | R2CV | RMSECV | PLSR factors | R2 | RMSEV | RPD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TL-Res | 76 | 37 | x | SG(2,6) | 1st deriv | | 0.880 | 0.440 | 6 | 0.864 | 0.381 | 2.51 |
| TL-Res_SNR | 76 | 37 | x | SG(2,6) | 1st deriv | | 0.815 | 0.547 | 4 | 0.824 | 0.469 | 2.04 |
| TL_ASD | 75 | 36 | x | SG(2,6) | 1st deriv | Mean center | 0.866 | 0.464 | 4 | 0.83 | 0.405 | 2.36 |
| TFNS_Res | 75 | 36 | x | SG(2,3) | | Mean centre | 0.011 | 1.338 | 6 | 0.243 | 0.965 | 0.99 |
| TFNS_Res_SNR | 75 | 36 | x | SG(2,3) | | Mean centre | 0.013 | 1.401 | 6 | 0.252 | 0.988 | 0.97 |
| TFNS_NIR_ASD | 75 | 36 | x | SG(2,6) | 1st deriv | Mean center | 0.827 | 0.531 | 5 | 0.857 | 0.366 | 2.62 |
| TFS-Res | 75 | 36 | x | SG(2,6) | 1st deriv | | 0.771 | 0.576 | 8 | 0.755 | 0.583 | 2.01 |
| TFS-Res_SNR | 75 | 36 | x | SG(2,6) | | Mean centre | 0.772 | 0.570 | 5 | 0.633 | 0.712 | 1.64 |
| TFS_NIR_ASD | 75 | 36 | x | SG(2,3) | 1st deriv | Mean center | 0.676 | 0.733 | 6 | 0.759 | 0.497 | 1.92 |

TL-Res: topsoil lab resampled spectra model; TL-Res_SNR: topsoil lab resampled spectra model with noise simulation; TL_ASD: best topsoil lab ASD spectra model; TFNS_Res: topsoil field resampled spectra model without stones; TFNS_Res_SNR: topsoil field resampled spectra model, with noise simulation, without stones; TFNS_NIR_ASD: best topsoil ASD field spectra model without stones; TFS_Res: topsoil field resampled spectra model with stones; TFS_Res_SNR: topsoil field resampled spectra model, with noise simulation, with stones; TFS_NIR_ASD: best topsoil ASD field spectra model with stones

The two topsoil laboratory models present satisfactory results, with an expected difference between the first (TL-Res), obtained by only resampling the laboratory spectra, and the second (TL-Res_SNR), calculated by also introducing the noise component. In the calibration and validation phases, TL-Res shows higher accuracy (RMSECV: 0.440; RMSEV: 0.381) than TL-Res_SNR, with the same data manipulation (table 12). The best topsoil OC prediction model obtained with laboratory spectra (TL_ASD) shows an RPD (2.36) higher than that of TL-Res_SNR (2.04) but lower than that of TL-Res (2.51), indicating that the model loses accuracy only when the noise component is included in the simulation (table 12). The satisfactory prediction ability of TL-Res_SNR is shown by the predicted versus observed OC validation values (figure 20), with no outliers and points close to the trend line; however, the model needs to be improved because of the negative (below zero) predicted values.

Figure 20: EnMAP topsoil laboratory OC predicted vs. observed values

Figure 21: B coefficients of EnMAP topsoil laboratory OC predicted vs. observed values

The B coefficients (figure 21) indicate that no single band range is exclusively responsible for the OC predictions; instead there are several peaks, corresponding to different EnMAP channels, in both the VIS (588 nm) and NIR (2299 nm) wavelengths. The resampling procedure did not produce the same results for the topsoil field data without stones; the model built including the noise simulation (TFNS_Res_SNR), once validated, gave a slightly higher R2 (0.252) than the same model without noise (TFNS_Res). In this case the best original model built with spectra collected in the field (TFNS_NIR_ASD) presents an RPD (2.62) much higher than those of TFNS_Res (0.99) and TFNS_Res_SNR (0.97); neither of the last two models shows significant prediction ability. The inaccuracy of TFNS_Res_SNR is shown by the plot of predicted versus observed OC values (figure 22); this model does not generate acceptable predictions, as shown by the distance of the points from the trend line, the negative predicted values, and the outliers. The analysis of the significant wavelengths for TFNS_Res_SNR indicates a peak of spectral loading at 630 nm and further peaks in the NIR spectral range (1352 and 2351 nm) (figure 23).

Figure 22: EnMAP topsoil field, without stones, OC predicted vs. observed values

Figure 23: spectral loadings and weights of EnMAP topsoil field OC predicted vs. observed values

The resampling of the topsoil field stone spectra (TFS_Res) gave lower accuracy in the calibration phase (RMSECV: 0.576) than the same model built adding the noise simulation (TFS_Res_SNR); the latter was also built using only 5 latent factors but, as expected, after validation TFS_Res_SNR gave a lower R2 (0.634) than TFS_Res, which presented a higher accuracy (table 12). The promising results of calibration and validation are confirmed by the RPD values: TFS_Res gave a higher RPD (2.01) than the best model produced with field stone data (TFS_NIR_ASD) (1.92), while TFS_Res_SNR, with an RPD of 1.62, can possibly be improved (Chang and Laird, 2002). The potential of TFS_Res_SNR is shown by the validated predicted versus observed OC values (figure 24). Most of the points are close to the trend line; however, there are negative predicted values and outliers. The B coefficients of TFS_Res_SNR (figure 25) clearly indicate the VIS and NIR spectral ranges (432-488, 606-843, 1699-1870, 1982-2236, and 2299-2395 nm) with positive slopes in the PLSR model.


Figure 24: EnMAP topsoil field stones OC predicted vs. observed values

Figure 25: B coefficients of EnMAP topsoil field stone OC predicted vs. observed values

As observed for the models produced with ASD laboratory and field spectral data (table 5), the OC values predicted with resampled and noise-simulated EnMAP data present ranges whose lowest values are below zero (table 13).

Table 13: Statistical description of the observed soil OC, analyzed using conventional laboratory methods of analysis, and their validated PLSR predictions using EnMAP resampled and noise simulated spectra

EnMAP model     training  test   Observed                  Predicted
                                 mean  SD    Range%        mean  SD    Range%
TL_Res          75        36     1.26  0.96  0.21-4.32     1.25  1.05  -0.4-4.82
TL_Res_SNR      75        36     1.26  0.96  0.21-4.32     1.34  1.11  -0.43-5.07
TFNS_Res        75        36     1.26  0.96  0.21-4.32     1.54  0.91  -0.03-3.80
TFNS_Res_SNR    75        36     1.26  0.96  0.21-4.32     1.53  0.98  -0.19-4.03
TFS_Res         74        36     1.33  1.17  0.21-5.92     1.25  1.10  -0.58-4.29
TFS_Res_SNR     74        36     1.33  1.17  0.21-5.92     1.33  1.08  -0.42-4.25

TL_Res: topsoil lab resampled spectra model; TL_Res_SNR: topsoil lab resampled spectra model with noise simulation; TFNS_Res: topsoil field resampled spectra model without stones; TFNS_Res_SNR: topsoil field resampled spectra model, with noise simulation, without stones; TFS_Res: topsoil field resampled spectra model with stones; TFS_Res_SNR: topsoil field resampled spectra model, with noise simulation, with stones

Overall, laboratory resampled data give more accurate OC ranges, but the up-scaling simulation of field data, including the stoniness effect, provides important results to build on in view of future airborne and space-borne data collection campaigns.
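The accuracy statistics quoted for these models (RMSE, R2 and RPD) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: R2 is taken here as the squared Pearson correlation between observed and predicted values, RPD as the standard deviation of the observed values divided by the RMSE, and the three classes follow the thresholds of 1.4 and 2.0 attributed to Chang and Laird (2002). The function names and the small data set are hypothetical and are not taken from this study.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1] ** 2)

def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observed values / RMSE."""
    sd = float(np.std(np.asarray(observed, dtype=float), ddof=1))
    return sd / rmse(observed, predicted)

def rpd_class(value):
    """Three accuracy classes with thresholds at RPD 1.4 and 2.0."""
    if value > 2.0:
        return "good"
    if value >= 1.4:
        return "can possibly be improved"
    return "poor"

# Hypothetical OC values in %, for illustration only (not thesis data).
observed = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.5, 1.5, 3.5, 3.5])

print(rmse(observed, predicted), r_squared(observed, predicted))
print(rpd(observed, predicted), rpd_class(rpd(observed, predicted)))
```

As a plausibility check on these definitions, dividing the observed SD of 1.17 listed for TFS_Res in table 13 by the RMSEV of 0,583 quoted for that model gives about 2.01, in line with the RPD reported above; whether the study computed the SD on exactly the same set of samples is not stated here.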


5. Discussion

Chemical and spectral analyses

During the fieldwork no soil classification was carried out, but the results of the chemical analyses revealed wide ranges of organic carbon (OC), iron oxides (Fe) and clay content (see table 1 and paragraph 4.1). The Pearson's correlation coefficients between topsoil and 0-20 cm values show that for OC and Fe the topsoil could be representative of the deeper layer, while topsoil and 0-20 cm clay are not correlated (see table 2). Land use could be the reason behind these trends. The STB was characterized by goat farming and, from the 1970s, with the increase in private property, also by game farming. The low intensity of the activity undergone by the soil system resulted in relatively uniform OC and Fe concentrations down to 20 cm depth, but water percolation, increased by the reduction of vegetation and water holding capacity, mobilized clay along the soil profile. Mills et al. (2003) reported that in the thicket biome vegetation removal produced important carbon losses, estimated at approximately 4.0 kg/m2 in soils to a depth of 500 mm and 4.5 kg/m2 in biomass (above- and belowground). Our study confirmed this trend, showing, for example, much higher topsoil and 0-20 cm OC for plot 6 (pristine vegetation) than for plot 5 (almost bare soil) (see 4.1 and appendixes II and III). Moreover, plots 51, 52, and 53 followed the same rule, with the first characterized by lower vegetation cover (visual estimation) and lower soil OC (appendixes II and III) than the other two. The differences in soil chemical contents among plots 51, 52 and 53, detected with the chemical analyses (see appendixes II and III), were also evident in the single spectra (fig. 9), confirming the possibility of detecting SOC content as an inverse function of soil spectral reflectance (Kooistra et al., 2003).
The results of the spectral measurements carried out in the laboratory and in the field offered points of discussion in terms of the mean reflectance detected by the instrument (fig. 8):

• As expected (Kooistra et al., 2003; Stevens et al., 2008), and despite the use of a contact probe, laboratory reflectance was higher than field reflectance, except for tfnos, which gave an exceptionally high average reflectance value.

• Topsoil laboratory reflectance was higher than 0-20 cm reflectance only when calculated without stones, while it was lower when the stoniness effect was included. Topsoil field reflectance without stones was on average much higher than that measured with stones. We can state that stoniness reduces the amount of light reflected by the soil, and again this condition needs to be considered for soil HRS applications over the STB.

• Topsoil reflectance measured under canopy was lower than that measured in the open field, owing to the much higher SOC content under vegetation cover than in bare soil. Using a contact probe, all shading effects by vegetation could be excluded. This made it possible to observe that soils reflect less light when they are less degraded.

The observed differences show how important it is to measure reflectance under field conditions in order to identify all the possible variables when building prediction models. Airborne and spaceborne up-scaling procedures meet their real challenge, and become truly relevant, when field conditions are represented. The up-scaling simulation to EnMAP, characterized by resampling to the 233 sensor channels and the highest possible noise level, showed the possible alteration of the original ASD field and laboratory spectra (figure 10), while still providing good predictions for SOC. Other issues, linked with atmospheric attenuation, pixel size, vegetation effect, and soil roughness, need to be simulated to better estimate the potential of EnMAP for soil properties prediction.

Models calibration and validation

In the past, the three soil properties investigated in this study, OC, Fe, and clay, were mostly predicted in the laboratory, combining VNIRS and PLSR (Kooistra et al., 2001; McCarty et al., 2002; Chang and Laird, 2002; Viscarra-Rossel et al., 2003; Cozzolino and Moron, 2003); however, there is very little literature on the potential of field spectroscopy as a tool to predict soil properties (Kooistra et al., 2003; Waiser et al., 2007; Gomez et al., 2008; Stevens et al., 2008). The general overview of the results (see 4.3) indicated that OC models are reliable and have prediction abilities. For both the calibration and the validation phases, laboratory OC prediction models gave better results than field models (see tables 3 and 4). Topsoil field models always gave R2 > 0.7 for both calibration and validation. Despite the use of a contact probe, topsoil field models with stones presented lower accuracies than those without stones. No significant differences were found between models created with or without water absorption bands, indicating that soil moisture did not significantly influence the spectral measurements. This result contrasts with other studies, where soil moisture was indicated as a factor causing significant variations in the field spectra and consequently influencing model accuracy (Kooistra et al., 2003; Chang et al., 2005). The prospects for HRS campaigns in the STB are therefore more promising, given the scarce influence of soil moisture on soil reflectance in this period of the year (October-November-December). The separate analyses of the VIS and NIR parts of the spectra indicate that, once validated, the OC models produced with NIR wavelengths have higher accuracies than VIS models, in contrast with Viscarra-Rossel et al. (2006), who did not find significant differences. In particular, TFNS_NIR and TFS_NIR have R2 of 0,880 and 0,759 and RMSEV of 0,336 and 0,497 respectively, and are the best OC prediction models for topsoil field data (table 4). The most important spectral bands for the response of soil reflectance to OC were identified in the VIS range, in contrast both with the above-mentioned results and with Henderson et al. (1992), who found the 12 optimal bands for OC characterization in the NIR and MIR. Although there are predictive differences between laboratory and field data (table 5), it was shown that field spectroscopy can be used with sufficient accuracy for rapid OC characterization.
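The comparison above between full-range, VIS and NIR models, with or without water absorption bands, amounts to choosing wavelength subsets before calibration. A minimal Python sketch of that selection follows. The 1 nm grid from 350 to 2500 nm, the VIS and NIR limits and the two water absorption windows are illustrative assumptions, not the settings used in this study, and `select_bands` is a hypothetical helper.

```python
import numpy as np

# Assumed wavelength grid of an ASD-type spectrometer: 350-2500 nm, 1 nm steps.
wavelengths = np.arange(350, 2501)

# Illustrative limits in nm (assumptions, not the settings of this study).
VIS = (400, 700)
NIR = (700, 2500)
WATER_BANDS = [(1350, 1450), (1800, 1950)]

def range_mask(wl, low, high):
    """True for wavelengths inside the closed interval [low, high]."""
    return (wl >= low) & (wl <= high)

def select_bands(spectra, wl, spectral_range=None, drop_water=False):
    """Return the selected columns of `spectra` (samples x bands) and their wavelengths."""
    mask = np.ones(wl.shape, dtype=bool)
    if spectral_range is not None:
        mask &= range_mask(wl, *spectral_range)
    if drop_water:
        for low, high in WATER_BANDS:
            mask &= ~range_mask(wl, low, high)
    return spectra[:, mask], wl[mask]

# Placeholder reflectance matrix: 3 samples, one column per wavelength.
spectra = np.zeros((3, wavelengths.size))

vis_spectra, vis_wl = select_bands(spectra, wavelengths, VIS)
nir_spectra, nir_wl = select_bands(spectra, wavelengths, NIR, drop_water=True)
print(vis_spectra.shape, nir_spectra.shape)
```

Each subset would then be passed to the same calibration and validation procedure, so that differences in accuracy can be attributed to the spectral range alone.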
The evaluation is different for the Fe prediction models, whose calibration gave R2CV lower than 0.3 and RMSECV < 1 (tables 6 and 7). The number of latent factors used for Fe predictions was on average lower than for OC predictions, but the validation phase did not produce equally good results, with RPD values always lower than 1,40. Field spectral models gave better Fe predictions than laboratory models, which indicates that the field contact probe minimizes the differences between laboratory and field spectrometry. No significant differences were noticed with the inclusion of water absorption bands in the full resolution spectra, while, in general, NIR data produced better results than VIS and full resolution models (table 8), in line with Richter et al. (2009), who predicted Fe in Mediterranean soils using a NIR centre wavelength (890 nm). TFNS_NIR and TFS_NIR are the best prediction models for both the calibration and the validation phases, with the former more accurate than the latter owing to the negative effect of stones on the soil reflectance properties. The Fe prediction models did not give high accuracies, but the predicted content means and ranges (table 8) suggest that better results could be obtained if more specific analyses and a thorough evaluation of the error components are carried out.

Clay prediction models gave insufficient results. All the models, produced by analysing VIS, NIR and VNIR separately, with and without water absorption bands, gave a calibration R2 lower than 0.1, with an RMSECV always higher than 7 (tables 9 and 10); the validation phase gave better results, but the RMSEV was always higher than 5. All the models gave an RPD < 1,20. There could be several reasons behind these unsatisfactory results, including the impossibility of predicting clay content with field spectroscopy in this biome, or mistakes in the chemical analyses used for model calibration (the spectral data were the same as those used for the OC prediction models). This study clearly did not produce a strong or reliable basis for clay prediction models, so the experiment will need to be repeated, especially in view of hyperspectral campaigns.
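The calibration and cross-validation steps discussed in this section can be illustrated with a small single-response PLSR written in NumPy. It is only a sketch on synthetic data: the NIPALS formulation, the 5-fold split and all function names are assumptions made for illustration, and it is not the software with which the models of this study were built. It shows how RMSECV can guide the choice of the number of latent factors and how the B coefficients of a model are obtained.

```python
import numpy as np

def pls1_fit(X, y, n_factors):
    """Fit a single-response PLS regression with the NIPALS algorithm."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean        # centred data, deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)             # weight vector
        t = Xr @ w                         # scores of this latent factor
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = yr @ t / tt                    # y loading
        Xr = Xr - np.outer(t, p)           # remove the explained part of X
        yr = yr - c * t                    # remove the explained part of y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)    # B (regression) coefficients
    return b, x_mean, y_mean

def pls1_predict(X, model):
    b, x_mean, y_mean = model
    return (X - x_mean) @ b + y_mean

def rmsecv(X, y, n_factors, n_folds=5, seed=0):
    """Root mean square error of k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = 0.0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        model = pls1_fit(X[train], y[train], n_factors)
        sq_err += np.sum((pls1_predict(X[test_idx], model) - y[test_idx]) ** 2)
    return float(np.sqrt(sq_err / len(y)))

# Synthetic "spectra": three hidden factors; the factor that dominates X
# contributes little to y, so one latent factor is not enough.
rng = np.random.default_rng(42)
n_samples, n_bands = 75, 200
scores = rng.normal(size=(n_samples, 3))
loadings = rng.normal(size=(3, n_bands)) * np.array([5.0, 1.0, 1.0])[:, None]
X = scores @ loadings + 0.05 * rng.normal(size=(n_samples, n_bands))
y = 1.3 + scores @ np.array([0.1, 0.8, -0.5]) + 0.1 * rng.normal(size=n_samples)

errors = {k: rmsecv(X, y, k) for k in range(1, 8)}
best_k = min(errors, key=errors.get)
model = pls1_fit(X, y, best_k)
print(best_k, errors[best_k])
```

The vector `model[0]` holds one B coefficient per band; plotting it against wavelength gives the kind of representation used to identify the spectral ranges that contribute most to a prediction.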

EnMAP simulations

The simulation of EnMAP up-scaling was carried out only for the topsoil OC laboratory and field spectroscopy models, because OC was the only property predicted with a high level of accuracy from ground data, and because the interest expressed by STRP focuses mostly on predicting the potential for OC sequestration over the largest possible scale. The only two models that did not give sufficient accuracy were TFNS_Res and TFNS_Res_SNR (table 13), with low R2 for both the calibration and the validation phases. The laboratory stones topsoil models reached a high level of accuracy. The topsoil laboratory model with noise simulation (TL_Res_SNR) did not differ significantly in accuracy from the same model without noise simulation (TL_Res). The two topsoil field models present a promising scenario in relation to future up-scaling procedures. TFS_Res gave good accuracy levels for calibration (RMSECV: 0,576) and validation (RMSEV: 0,583); the same model obtained with noise simulation (TFS_Res_SNR) showed higher accuracy for calibration (RMSECV: 0,566) but lower accuracy for validation (RMSEV: 0,714). Gomez et al. (2008) presented the results of an up-scaling simulation of field spectral data to Hyperion spectral resolution, obtaining good predictions and observing that the spectral resolution did not change the accuracy of the model. However, the subsequent SOC predictions based on real Hyperion hyperspectral data were less accurate than those from the resampled field spectra, owing to sensor noise and Hyperion spatial resolution (30 m). The EnMAP up-scaling simulation carried out for this study attempted to recreate the worst possible case, simulating the lowest SNR (100) for all the EnMAP spectral channels, but other important issues influencing the quality of soil properties prediction, like vegetation, soil crust, soil surface roughness, and pixel size, were not taken into consideration. For example, despite the severe degradation, there are several areas of the STB characterized by intact vegetation. The EnMAP spatial resolution (30 m) would mean studying mixed surfaces, and many pixels could not be used for soil properties prediction. Moreover, as shown in this research, STB soils are covered by small stones, which definitely affect the amount of light reflected and detected by the instruments. Furthermore, the STB surface is not always flat, smooth, or homogeneous, and therefore sample preparation (as is done in the laboratory) is almost impossible. This leads to problems such as variations in particle size, adjacency, and Bidirectional Reflectance Distribution Function (BRDF) effects, and calls for the development of methodologies that represent a pixel well, on the ground and in the EnMAP sensor, from both the chemical and the spectral perspectives (Ben-Dor, 2008).
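The two steps of the up-scaling simulation discussed above, resampling fine-resolution spectra to broader sensor channels and adding noise at a fixed SNR, can be sketched as follows. This is a hedged illustration rather than the procedure used in this study: the Gaussian spectral response functions, the evenly spaced band centres, the 10 nm FWHM and the noise model (zero-mean Gaussian noise with a standard deviation equal to the reflectance divided by the SNR) are all assumptions, and the band list is hypothetical rather than the real EnMAP channel configuration.

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def gaussian_resample(wl_src, spectra, centres, fwhm):
    """Resample spectra (samples x source bands) to the bands given by
    `centres`, using one normalised Gaussian response function per band."""
    centres = np.asarray(centres, dtype=float)
    sigma = np.broadcast_to(np.asarray(fwhm, dtype=float), centres.shape) * FWHM_TO_SIGMA
    diff = np.asarray(wl_src, dtype=float)[None, :] - centres[:, None]
    srf = np.exp(-0.5 * (diff / sigma[:, None]) ** 2)
    srf /= srf.sum(axis=1, keepdims=True)      # each response sums to 1
    return np.asarray(spectra, dtype=float) @ srf.T

def add_noise(spectra, snr=100.0, seed=0):
    """Add zero-mean Gaussian noise with standard deviation signal / SNR."""
    rng = np.random.default_rng(seed)
    return spectra + rng.normal(size=spectra.shape) * (spectra / snr)

# Source grid of an ASD-type instrument (assumed): 350-2500 nm, 1 nm steps.
wl_src = np.arange(350.0, 2501.0)

# Hypothetical sensor: bands every 10 nm from 420 to 2450 nm, FWHM 10 nm.
centres = np.arange(420.0, 2451.0, 10.0)

# Two synthetic reflectance spectra: a linear ramp and a flat spectrum.
ramp = 0.1 + 0.3 * (wl_src - 350.0) / 2150.0
flat = np.full_like(wl_src, 0.25)
spectra = np.vstack([ramp, flat])

resampled = gaussian_resample(wl_src, spectra, centres, fwhm=10.0)
noisy = add_noise(resampled, snr=100.0)
print(resampled.shape, noisy.shape)
```

Prediction models would then be calibrated on `resampled` or on `noisy` instead of the original spectra, so that the loss of accuracy due to spectral resolution and to sensor noise can be assessed separately.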

Future perspectives

In December 1997 more than 150 countries signed the Kyoto Protocol to the United Nations Framework Convention on Climate Change, a treaty aimed at cutting emissions of the main greenhouse gases. Because CO2 is the main contributor to radiative forcing, mechanisms for either reducing CO2 emissions or capturing atmospheric CO2 in vegetation or soils have been highlighted as the most practical short-term measures for reducing the present rise in atmospheric CO2 concentration (Mills et al., 2003). This favored the creation of an international market for carbon, whereby carbon emitters pay for the creation of carbon sinks. The high level of degradation of the STB (Figure 26) could represent an important opportunity to develop alternative sources of income, such as participation in the carbon market.

Figure 26: fence effect denoting severe land degradation in the STB

Mills and Cowling (2006) calculated an average carbon sequestration rate of 0.42 ± 0.08 kg C m-2 yr-1 over 27 years in Krompoort, a location in the core of the STB semiarid region, after P. afra replanting. Whittaker and Niering (1975) report a rate of net primary production of 0.07–0.1 kg C m-2 yr-1 for shrublands in Arizona that occur in a climate similar to that of the STB. The great potential of the STB as a carbon sink is confirmed by the high productivity of P. afra and its tolerance to drought: it shifts from a C3 to a CAM photosynthetic mode in response to water and NaCl stress (Ting & Hanscom, 1977), increasing daylength, and increasing temperature, irrespective of moisture status (Guralnick et al., 1984). The above-mentioned estimates of carbon sequestration refer to the local scale and were obtained after long experiments and expensive chemical analyses. HRS can play a decisive role, making it possible to estimate the potential for SOC sequestration over a much larger scale, with good levels of accuracy and without the traditional methods for the estimation of soil properties (Gomez et al., 2008). In agreement with Chang and Laird (2002), we can state that ground-based spectroscopy (laboratory and field) can be a reliable technique to measure SOC content; Stevens et al. (2008) found that remote sensing, for agricultural fields, might be a practical way to spatially evaluate soil carbon on large scales, but that it still shows an unacceptable level of accuracy, probably due to the land use (ploughing causes too high a mobilization of SOC along the profile). Conversely, the STB could provide a perfect environment for OC prediction, based on the "constant" land use over time and space, the favorable topography of the biome, and the scarce vegetation cover. STRP is already taking place, with the participation of several landowners, but the number of people who participate in the restoration program is still quite low in relation to the 800000 ha of deforested thicket. The possibility of estimating with enough accuracy the amount of carbon sequestered and, consequently, of giving a precise indication of the possible income deriving from the conversion of the land from farming to restoration, would increase the total participation in the project. This study, based on a relatively small working scale (transect), is creating the conditions for the up-scaling process which will enable the quantification of SOC and other soil properties over the whole STB of South Africa.
The added value of the prediction models developed for this research is their generality and spatial applicability, due to the homogeneity of the described conditions, which are valid for the whole STB. The very large scale of the STRP makes it necessary to proceed with sound forethought and planning (Cowling et al., 2006), and remote sensing offers stakeholders the opportunity to build a valuable and strong baseline as a reference point for decision-making.


6. Conclusions and recommendations

This study examines the possibility of predicting organic carbon, iron oxides and clay content in the topsoil and 0-20 cm layers by combining field spectroscopy and PLSR techniques, in the subtropical thicket biome (STB) of the Eastern Cape province of South Africa, with the intention of developing cost-effective methods to rapidly monitor soil properties. The results of the chemical analyses indicate that topsoil and 0-20 cm OC and Fe are correlated (Pearson's correlation coefficient = 0.77 and 0.68, respectively), while for clay the correlation is very low. The mean soil spectra showed differences in reflectance between soils measured with and without stones, indicating that stoniness should be considered an important factor, especially in relation to the up-scaling process to airborne and space-borne hyperspectral data. Field soil moisture content did not decrease prediction accuracies for organic carbon, iron oxides and clay content. The PLSR models developed with laboratory and field spectra offered very good results for the prediction of OC (calibration and validation R2 > 0.7, RMSEV < 0.6), and an insufficient level of accuracy for Fe (RPD always < 1,40) and clay (RPD < 1,20). The analyses of the separate band ranges showed that NIR wavelengths produced higher accuracy than VIS and full spectral resolution for the prediction of 0-20 cm and topsoil OC and Fe, but made no difference for the prediction of clay. The simulation of EnMAP up-scaling, carried out only for the topsoil OC laboratory and field spectroscopy models and including the highest possible noise (SNR: 100) across all the sensor band channels, gave predictions with good accuracy (average RPD about 2) and promising results for the field spectra models (R2: 0,634).
The results indicate that, for the subtropical thicket biome, combining soil spectroscopy and PLSR does favour an accurate prediction of OC, while further investigation is needed to improve the iron oxides and clay content prediction models.

For future studies it is recommended to:

• Create a soil type database to include and to refer to for the selection of the study area;

• Increase the number of visited points and enlarge the area of interest in order to include as much biome variability as possible;

• Depending on instrument and time availability, carry out all soil sampling and field spectral data collection at the same time;

• Improve the Fe oxides prediction models;

• Identify the reasons behind the unsuccessful clay predictions and, based on these elements, reconstruct the models, also using alternatives to PLSR;

• Focus on more soil properties to test the capability of VNIRS to predict multiple properties in the STB;

• Carry out an airborne HRS data collection and upscale the models built in this study, evaluating the effects of vegetation, soil crust, stoniness, and pixel size on the soil properties prediction capability;

• Persist in the use of a contact probe for soil properties prediction with field spectroscopy, as it eliminates the influence of vegetation on the spectral reflectance and increases the SNR.


7. References

• Acocks JPH, 1953. Veld types of South Africa. Memoirs of the Botanical Survey of South Africa 28: 1–128

• Agbenin, A., 2003. Extractable Iron and Aluminum Effects on Phosphate Sorption in a Savanna Alfisol. Soil Science Society of America Journal 67:589-595

• Aucamp A.J., 1976. The role of the browser in the bushveld of the Eastern Cape. Proceedings of the Grassland Society of South Africa 11:135-138.

• Baize, D., Jabiol, B., 1995. Guide pour la description des sols. Paris, INRA edition

• Barnes R.J., Dhanoa M.S., Lister S.J., 1989. Applied Spectroscopy 43, 772.

• Bartholomeus H., 2007. Determining iron content in Mediterranean soils in partly vegetated areas, using spectral reflectance and imaging spectroscopy. International Journal of Applied Earth Observation and Geo-Information 9, 194–203

• Ben-Dor, E., Banin, A., 1995. Near-infrared analysis as a rapid method to simultaneously evaluate several soil properties. Soil Science Society of America Journal 59, 364–372

• Ben-Dor, E., Patkin, K., Banin, A., Karnieli, A., 2002. Mapping of several soil properties using DAIS-7915 hyperspectral scanner data—a case study over clayey soils in Israel. International Journal of Remote Sensing 23, 1043–1062

• Ben-Dor E., Taylor R.G., Hill J., Demattê J.A.M., Whiting M.L., Chabrillat S., Sommer S., 2008. Imaging Spectrometry for Soil Applications. Chapter 8, pp. 321-324

• Bengera I., K. H. Norris. 1968a. Influence of fat concentration on absorption spectrum of milk in near-infrared region. Israel J. Agr. Res. 18 (3): 117.

• Bengera I., K. H. Norris. 1968b. Determination of moisture content in soybeans by direct spectrophotometry. Israel J. Agr. Res. 18 (3): 125

• Brereton R.G., 1990. Chemometrics: applications of mathematics and statistics to laboratory systems. Ellis Horwood, London, UK.


• Buol S. W., Hole F. D., and McCracken R. J., 1973. Soil Genesis and Classification, p. 360. The Iowa State University Press, Ames

• Chang, C.-W., Laird, D.A., 2002. Near-infrared reflectance spectroscopic analysis of soil C and N. Soil Science 167 (2), 110–116

• Clark R.N, 1999. Spectroscopy of rocks and minerals, and principles of spectroscopy. p. 3-52. In A.N. Rencz (ed.) Remote sensing for the earth sciences: Manual of remote sensing. John Wiley & Sons, New York

• Cocks M.L., Wiersum K.F., 2003. The significance of plant diversity to rural households in eastern Cape province of South Africa. Forests, Trees and Livelihoods 13:39-58.

• Condit H. R., 1972. Application of characteristic vector analysis to the spectral energy distribution of daylight and the spectral reflectance of American soils. App. Opt. 11, 74–86

• Cozzolino, D., Moron, A., 2003. The potential of near-infrared reflectance spectroscopy to analyse soil chemical and physical characteristics. Journal of Agricultural Sciences 140, 65–71

• Cozzolino D., Moron A., 2006. Potential of near-infrared reflectance spectroscopy and chemometrics to predict soil organic carbon fractions. Soil & Tillage Research 85: 78–85

• Cowling R.M., Kirkwood D., Midgley J.J., Pierce S.M., 1997. Invasion and persistence of bird-dispersed, subtropical thicket and forest species in fire-prone fynbos. Journal of Vegetation Science 8, 475–488

• Cowling R.M., 1983. Phytochorology and vegetation history in the south-eastern Cape, South Africa. Journal of Biogeography 10, 393–419

• Cowling R.M., Procheş Ş., Vlok J.H.J., 2004. On the origin of southern African subtropical thicket vegetation. South African Journal of Botany, 71(1): 1–23

• Cowling R.M., Pierce S.M., Sigwela A., 2006. Mainstreaming restoration: a conceptual and operational framework. In: Aronson, J., Milton, S. & Blignaut, J. (eds.). Restoring Natural Capital – Science, Business, and Practice

• Daniel, K.W., Tripathi, N.K., Honda, K., 2003. Artificial neural network analysis of laboratory and in situ spectra for the estimation of macronutrients in soils of Lop Buri (Thailand). Aust. J. Soil Res. 41, 47–59.

• Duiker S.W., Rhoton F.E., Torrent J., Smeck N.E., Lal R., 2003. Iron (Hydr)Oxide Crystallinity Effects on Soil Aggregation. Soil Sci. Soc. Am. J. 67:606–611

• Dunn, B.W., H.G. Beecher, G.D. Batten, and S. Ciavarella. 2002. The potential of near-infrared reflectance spectroscopy for soil analysis – a case study from the Riverine Plain of south-eastern Australia. Aust. J. Exp. Agric. 42:607-614

• Ellert, B.H., H.H. Janzen, and T. Entz. 2002. Assessment of a method to measure temporal change in soil carbon storage. Soil Sci. Soc. Am. J. 66:1687-1695.

• Gaffey, S.J., L.A. McFadden, D. Nash, and C.M. Pieters. 1993. Ultraviolet, visible, and near-infrared reflectance spectroscopy: Laboratory spectra of geologic materials. p. 43-73.

• Geladi, P. and B.R. Kowalski, 1986. Partial least-squares regression: A tutorial. Anal. Chim. Acta. 185:1-17.

• Goetz, A. F. H., and Wellman, J. B. (1984). Airborne Imaging Spectrometer: A new tool for remote sensing. IEEE Trans. Geosci. Remote Sens. 22, 546–550

• Guggenheim, S., Martin, R. T.; 1995, "Definition of clay and clay mineral: Journal report of the AIPEA nomenclature and CMS nomenclature committees". Clays and Clay Minerals 43: 255–256

• Guralnick, L. J., P. A. Rorabaugh, and Z. Hanscom. 1984a. Influence of photoperiod and leaf age on Crassulacean Acid Metabolism in Portulacaria afra. Jacq. Plant Physiology 75:454–457

• Henderson T.L., Baumgardner M.F., Franzmeier D.P., Stott D.E., and Coster D.C., 1992. High dimensional reflectance analysis of soil organic matter. Soil Sci. Soc. Am. J. 56:865-872.

• Hunt, G.R., Salisbury, J.W., 1970. Visible and near infrared spectra of minerals and rocks. I. Silicate minerals. Modern Geol. 1, 283–300

• Janik, L. J., R. H. Merry, and J. O. Skjemstad, 1998. Can mid infrared diffuse reflectance analysis replace soil extractions? Aust. J. Exp. Agric. 38:681-696


• Kerley, G.I.H., M.H. Knight, and M. De Kock, 1995. Desertification of subtropical thicket in the Eastern Cape, South Africa: are there alternatives? Environmental Monitoring and Assessment 37:211-230

• Kerley, G.I.H., A.F. Boshoff, and M.H. Knight, 1999. Ecosystem integrity and sustainable land-use in the Thicket Biome, South Africa. Ecosystem Health 5:104-109.

• Kerley, G.I.H., A.F. Boshoff, and M.H. Knight; 2002. The Greater Addo National Park, South Africa: Biodiversity conservation as the basis for a healthy ecosystem and human development opportunities. In Managing for Healthy Ecosystems, eds. D.J. Rapport, W.L. Lasley, D.E. Rolston, N.O. Nielsen, C.O. Qualset, and A.B. Damania, pp. 359-374, CRC Press, Boca Raton.

• Kooistra, L., Wehrens, R., Leuven, R.S.E.W., Buydens, L.M.C., 2001. Possibilities of visible-near infrared spectroscopy for the assessment of soil contamination in river floodplains. Anal. Chim. Acta 446, 97–105.

• Kooistra, L., Wanders, J., Epema, G.F., Leuven, R.S.E.W., Wehrens, R., Buydens, L.M.C. 2003. The potential of field spectroscopy for the assessment of sediment properties in river floodplains. Anal. Chim. Acta 484, 189–200.

• Lee, S.W., J. F. Sanchez, R.S. Mylavarapu, and J. S. Choe. 2003. Estimating chemical properties of Florida soils using spectral reflectance. Trans. ASAE. 46:1443-1453.

• Li, B., Morris, J., Martin, E.B., 2002. Model selection for partial least squares regression. Chemometrics and intelligent laboratory systems. 64: 79-89

• Lloyd, J. W., Van den Berg E. C., and Palmer A. R.; 2002. Patterns of transformation and degradation in the Thicket Biome, South Africa. TERU Report 39, University of Port Elizabeth, Port Elizabeth, South Africa.

• Low AB, Rebelo AG, 1996. Vegetation of South Africa, Lesotho and Swaziland. Department of Environmental Affairs and Tourism, Pretoria, South Africa

• McCarty, G.W., Reeves III, J.B., Reeves, V.B., Follett, R.F., Kimble, J.M., 2002. Mid-infrared and near-infrared diffuse reflectance spectroscopy for soil carbon measurement. Soil Sci. Soc. Am. J. 66, 640–646

• McFadden L.D., Hendricks D.M., 1985. Changes in the content and composition of pedogenic iron oxyhydroxides in a chronosequence of soils in southern California. Quat. Res. 23, pp. 189–204

• Midgley, J.J. 1991. Valley Bushveld dynamics and tree euphorbias. In Proceedings of the first Valley Bushveld/Subtropical Thicket Symposium, ed. P.J.K Zacharias, G.C. Stuart-Hill, and J.J. Midgley, 8-9. Special Publication, Grassland Society of Southern Africa

• Mills, A.J., O'Connor T.G., Bosenberg D.W., Donaldson J., Lechmere-Oertel R.G., Fey M.V., and Sigwela A., 2003. Farming for carbon credits: implications for land use decisions in South African rangelands. 26th July - 1st August 2003, Durban, South Africa

• Mills, A.J., Fey M.V., 2004. Soil carbon and nitrogen in five contrasting biomes of South Africa exposed to different land uses. South African Journal of Plant and Soil 21:81-90

• Milne et al., 2008. Soil organic carbon. The encyclopedia of Earth

• McVay K. A., Rice C.W.; 2006. Soil organic carbon and the global carbon cycle. Kansas State University

• Morgan, C.L.S., J.M. Norman, C.C. Molling, K. McSweeney, and B. Lowery. 2003. Evaluating soil data from several sources using a landscape model. p. 243-260.

• Pachepsky, Ya.A., Timlin D.J., and Rawls W.J. 2001. Soil water retention as related to topographic variables. Soil Sci. Soc. Am. J. 65:1787-1795.

• Powell M., Mills A., Marais C., 2006. Carbon Sequestration and restoration: challenges and opportunities in subtropical thicket. Department of Water Affairs and Forestry conference proceedings. Day 2, session 3, item 1.

• Reuter, G., 1991. 35 Jahre Rostocker Dauerversuche. I. Entwicklung der Humusgehalte. Archiv für Acker-. Pflanzenbau und Bodenkunde 35, pp. 357–364.

• Reeves III, J. B., G. W. McCarty, and J. J. Meisinger. 1999. Near infrared reflectance spectroscopy for the analysis of agricultural soils. J. Near Infrared Spectrosc. 7:179-193.

• Reeves III, J. B. and G. W. McCarty. 2001. Quantitative analysis of agricultural soils using near infrared reflectance spectroscopy and a fibre-optic probe. J. Near Infrared Spectrosc. 9:25-34.


• Reeves III, G. McCarty, T. Mimmo, 2002. The potential of diffuse spectroscopy for the determination of carbon inventories in the soil. Environ. Pollut. 116. S277-284

• Salgo, J. Nagy, J. Tarnoy, P. Marth, O. Palmai, G. Szabo-Kele, J. Near Infrared Spectrosc. 6, 1998, 199.

• Schlesinger, W.H. 1990. Evidence from chronosequence studies for a low carbon storage potential of soils. Nature (London) 348: 232-234.

• Schrire BD, Lavin M, Lewis GP (2005) Global distribution patterns of the Leguminosae: insights from recent phylogenies. Biologiese Skrifter 55: 375–422

• Schwertmann U., Cornell R.M., 2003. The Iron Oxides: Structure, Properties, Reactions, Occurrences and Uses. Pp. 433-438

• Selige, T., Bohner, J., and Schmidhalter, U., 2006. High resolution topsoil mapping using hyperspectral image and field data in multivariate regression modeling procedure. Geoderma 136, 235–244

• Shepherd, K.D., Walsh, M.G., 2002. Development of reflectance spectral libraries for characterization of soil properties. Soil Sci. Soc. Am. J. 66, 988–998.

• Stevens, A., Van Wesemael, B., Bartholomeus, H., Rosillon, D., Tychon, B., Ben-Dor, E., 2008. Laboratory, field and airborne spectroscopy for monitoring organic carbon content in agricultural soils. Geoderma 144 (1-2), pp. 395-404

• Thompson L. M., 1957. Soils and Soil Fertility. McGraw-Hill Book Company Inc., New York.

• Torrent, J., Schwertmann, U., Fetcher, H., Alferez, F., 1983. Quantitative relationships between soil color and hematite content. Soil Sci. 13, 354–358.

• Sadler, E.J., B.K. Gerwig, D.E. Evans, J.A. Millen, P.J. Bauer, and W.J. Busscher. 1999. Site-specificity of CERES-Maize model parameters: A case study in the Southeastern US Coastal Plain. p. 551-560. In J.V. Stafford (ed.) Precision agriculture ’99. Proc. 2nd European Conf. Sheffield Academic Press, UK.

• Savitzky A., Golay M.J.E., 1964. Analytical Chemistry 36, 1627.

• Skead, C.J. 1987. Historical mammal incidence in the Cape Province. Volume 2: The eastern half of the Cape Province, including the Ciskei, Transkei and Griqualand. The Chief Directorate Nature and Environmental Conservation of the Cape of Good Hope, Cape Town, South Africa.

• Stuart-Hill, G.C. 1992. Effects of elephants and goats on the Kaffrarian succulent thicket of the Eastern Cape, South Africa. Journal of Applied Ecology 29:699-710.

• Stuart-Hill, G.C., and A.J. Aucamp. 1993. Carrying capacity of the succulent valley bushveld of the eastern Cape. African Journal of Range and Forage Science 10:1-10.

• Ting, I. P., and Hanscom Z., 1977. Induction of acid metabolism in Portulacaria afra. Plant Physiology. 59:511–514

• Tobias, R. 1995. An introduction to partial least squares regression. Proceedings of the Twentieth Annual SAS Users Group International Conference. Cary, NC: SAS Institute Inc., 1250-1257.

• Tsai F., Philpot W., 1998. Derivative analyses of hyperspectral data. Remote Sens. Environ. 66:41–51

• United Nations. 1992. The global partnership for environment and development: a guide to Agenda 21. United Nations Conference on Environment and Development, Geneva

• Viscarra Rossel, R.A., Walvoort, D.J.J., McBratney, A.B., Janik, L.J., Skjemstad, J.O., 2006. Visible, near infrared, mid infrared or combined diffuse reflectance spectroscopy for simultaneous assessment of various soil properties. Geoderma 131, 59–75.

• Viscarra Rossel, R.A., 2008. ParLeS: Software for chemometric analysis of spectroscopic data. Chemometrics and Intelligent Laboratory Systems, 90: 72-83. doi:10.1016/j.chemolab.2007.06.006

• Vlok JHJ, Euston-Brown DIW, Cowling RM, 2003. Acocks’ Valley Bushveld 50 years on: new perspectives on the delimitation, characterisation and origin of subtropical thicket vegetation. South African Journal of Botany 69: 27–51

• Walkley, A., Black, I.A., 1934. An examination of the Degtjareff method for determining soil organic matter and a proposed modification of the chromic acid titration method. Soil Science 37, 29–37

• Waiser T.H., Brown D.J., Hallmark C.T., 2007. In Situ Characterization of Soil Clay Content with Visible Near-Infrared Diffuse Reflectance Spectroscopy. Soil Sci. Soc. Am. J. 71:389–396


• Wold, S., M. Sjostrom, and L. Eriksson. 2001. PLS-regression: A basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58:109-130.

• Workman, J., Jr. and J. Shenk. 2004. Understanding and using the near-infrared spectrum as an analytical method. p. 3-10.


8. Appendices

Appendix I: visited points of the study area

Point  Latitude       Longitude     Height  Farm              Land use
1      -33.571580185  25.331854550  343.88  Greenfields       goat grazing
2      -33.570079135  25.331828883  338.47  Greenfields       goat grazing
3      -33.569430218  25.331662883  346.20  Greenfields       goat grazing
4      -33.516401935  25.353759150  237.88  Schilpad Laagte   goat grazing
5      -33.516399000  25.353759000  237.88  Schilpad Laagte   goat grazing
6      -33.516857435  25.353961433  244.01  Schilpad Laagte   goat grazing
7      -33.515870902  25.338798917  283.28  Schilpad Laagte   goat grazing
8      -33.515005802  25.327518083  303.81  Schilpad Laagte   goat grazing
9      -33.515935252  25.327914383  306.75  Schilpad Laagte   goat grazing
10     -33.535447468  25.375183533  213.80  Schilpad Laagte   goat grazing
11     -33.535414518  25.375766233  212.60  Schilpad Laagte   goat grazing
12     -33.494955618  25.353576633  244.28  Stenbokvlakte     goat grazing
13     -33.494285502  25.352257733  241.68  Stenbokvlakte     goat grazing
14     -33.493769868  25.350922700  239.11  Stenbokvlakte     goat grazing
15     -33.490330402  25.347624383  229.97  Stenbokvlakte     goat grazing
16     -33.489177335  25.344937083  222.35  Stenbokvlakte     goat grazing
17     -33.471001335  25.330311883  193.58  Stenbokvlakte     goat grazing
18     -33.471464868  25.331488933  194.60  Stenbokvlakte     goat grazing
19     -33.465336502  25.335176750  208.30  Stenbokvlakte     goat grazing
20     -33.464826502  25.335975100  210.15  Stenbokvlakte     goat grazing
21     -33.459420418  25.346851233  229.90  Stenbokvlakte     goat grazing
22     -33.458602968  25.346347700  229.33  Stenbokvlakte     goat grazing
23     -33.439098285  25.340372633  221.43  Stenbokvlakte     goat grazing
24     -33.440537902  25.325574400  169.44  Stenbokvlakte     goat grazing
25     -33.425994818  25.331467317  151.60  Correctional Ser  game reserve
26     -33.424111885  25.336841550  146.84  Correctional Ser  game reserve
27     -33.423993052  25.336369000  147.38  Correctional Ser  game reserve
28     -33.421199285  25.332823017  150.65  Correctional Ser  game reserve
29     -33.420804335  25.332745900  150.28  Correctional Ser  game reserve
30     -33.414616635  25.336074350  150.61  Correctional Ser  game reserve
31     -33.410224318  25.332680467  170.46  Correctional Ser  game reserve
32     -33.409826168  25.332337783  175.73  Correctional Ser  game reserve
33     -33.403719918  25.327326317  239.52  Correctional Ser  game reserve
34     -33.404332218  25.327808150  233.34  Correctional Ser  game reserve
35     -33.094910668  25.290047633  349.36  Fonteinsplaas     goat grazing
36     -33.092698402  25.288235333  344.36  Fonteinsplaas     goat grazing
37     -33.090848402  25.289070017  337.90  Fonteinsplaas     goat grazing
38     -33.105898618  25.293559717  339.14  Fonteinsplaas     goat grazing
39     -33.105333852  25.292033683  340.48  Fonteinsplaas     goat grazing
40     -33.082052735  25.296876867  357.35  Fonteinsplaas     goat grazing
41     -33.076082635  25.301948550  364.15  Fonteinsplaas     goat grazing
42     -33.070203735  25.301603217  374.17  Fonteinsplaas     goat grazing
43     -33.064583435  25.302052483  382.68  Fonteinsplaas     goat grazing
44     -33.057485985  25.301898083  393.50  Fonteinsplaas     goat grazing
45     -33.048088285  25.304074200  410.85  Winterfontein     goat grazing
46     -33.042163852  25.307910317  437.52  Winterfontein     goat grazing
47     -33.038248602  25.304597817  466.96  Winterfontein     goat grazing
48     -33.027705652  25.296265133  472.79  Winterfontein     goat grazing
49     -33.025708652  25.287452283  496.95  Winterfontein     goat grazing
50     -33.026824868  25.280935800  521.84  Winterfontein     goat grazing
51     -33.005154685  25.298272900  532.13  Kalkfontein       game reserve
52     -33.005163902  25.302410300  565.90  Kalkfontein       game reserve
53     -33.001983635  25.307428217  617.02  Kalkfontein       game reserve
54     -32.991813018  25.302810283  579.28  Kruizementfontei  game reserve
55     -32.985778468  25.304465750  583.60  Kruizementfontei  game reserve
56     -32.979325968  25.303611767  585.20  Kruizementfontei  game reserve
57     -32.975605468  25.298175050  610.82  Kruizementfontei  game reserve
58     -32.968906202  25.287853050  725.57  Kruizementfontei  game reserve
59     -32.951150168  25.297604217  878.50  Kruizementfontei  game reserve
60     -32.975339802  25.294959550  619.93  Kruizementfontei  game reserve
61     -32.975733968  25.296367250  605.49  Kruizementfontei  game reserve
62     -32.932991668  25.323346567  931.59  Verstokenfontein  game reserve
63     -32.931625735  25.317728883  932.48  Verstokenfontein  game reserve
64     -32.894412002  25.305723683  627.12  Gonakraal         goat grazing
65     -32.891829218  25.303894633  622.32  Gonakraal         goat grazing
66     -32.888269418  25.298175967  608.40  Gonakraal         goat grazing
67     -32.881949218  25.292377900  580.37  Gonakraal         goat grazing
68     -32.879371085  25.281204450  592.07  Skitfontein       goat grazing
69     -32.875419452  25.281564817  637.08  Skitfontein       goat grazing
70     -32.872690002  25.278896350  594.09  Skitfontein       goat grazing
71     -32.868934885  25.278749500  661.42  Skitfontein       goat grazing
72     -32.867147302  25.282673317  693.62  Skitfontein       goat grazing
73     -32.863417368  25.286791150  691.99  Skitfontein       goat grazing
74     -32.852727835  25.290456333  781.52  Skitfontein       goat grazing
75     -32.852473285  25.290386383  781.56  Skitfontein       goat grazing
76     -32.824264918  25.263992533  693.22  De Draai          goat grazing
77     -32.831654535  25.271482067  746.92  De Draai          goat grazing
78     -32.837207752  25.272510850  786.70  De Draai          goat grazing
79     -32.838416435  25.276847150  934.42  De Draai          goat grazing
80     -32.823079552  25.286797650  884.80  De Draai          goat grazing
81     -32.817877868  25.289180867  832.98  De Draai          goat grazing
82     -32.804285918  25.272035167  674.15  Vaalklip          game reserve
83     -32.796881002  25.285006183  830.86  Vaalklip          game reserve
84     -32.791681785  25.290977467  911.53  Vaalklip          game reserve
85     -32.790277502  25.305503717  989.95  Vaalklip          game reserve
86     -32.793562568  25.286683817  868.52  Vaalklip          game reserve
87     -32.765920102  25.283557700  717.26  Stroh's Fontein   goat grazing
88     -32.761424568  25.284210633  744.88  Stroh's Fontein   goat grazing
89     -32.756701668  25.285593750  751.37  Stroh's Fontein   goat grazing
90     -32.750650385  25.283481067  721.86  Stroh's Fontein   goat grazing
91     -32.747607535  25.281558317  733.49  Stroh's Fontein   goat grazing
92     -32.741213102  25.284266083  799.78  Stroh's Fontein   goat grazing
93     -32.739632485  25.283620583  806.62  Stroh's Fontein   goat grazing
94     -32.716142152  25.280357367  726.76  Toekoms           goat grazing
95     -32.714996552  25.281644800  723.01  Toekoms           goat grazing
96     -32.710952468  25.284062067  725.03  Toekoms           goat grazing
97     -32.703805585  25.280455983  746.57  Toekoms           goat grazing
98     -32.703627885  25.279980617  747.98  Toekoms           goat grazing
99     -32.702897368  25.279691167  752.70  Toekoms           goat grazing
100    -32.700308652  25.275740517  738.99  Toekoms           goat grazing
101    -32.693143535  25.275693750  764.12  Toekoms           goat grazing
102    -32.693782035  25.284422100  771.50  Toekoms           goat grazing
103    -32.705386452  25.298635767  804.09  Slot Van Candebo  goat grazing
104    -32.665370285  25.277594350  806.88  Allemans Fontein  goat grazing
105    -32.664561885  25.275580550  796.25  Allemans Fontein  goat grazing
106    -32.655023802  25.278696467  825.06  Stockpoort        goat grazing
107    -32.624270118  25.245913667  765.68  Stockpoort        goat grazing
108    -32.620941385  25.256703133  763.22  Stockpoort        goat grazing
109    -32.614987218  25.272630250  792.64  Stockpoort        goat grazing
110    -32.612627802  25.277227050  815.39  Stockpoort        goat grazing
111    -32.609626752  25.279603400  817.52  Stockpoort        goat grazing
112    -32.610213302  25.285414533  829.51  Stockpoort        goat grazing
113    -32.611529602  25.289142633  840.18  Stockpoort        goat grazing


Appendix II: soil chemical analyses results for 0-20cm samples collected during the soil sampling campaign

0-20cm (%)
Plot nr.  Clay   OC    Fe
1         25.64  1.36  7.28
2         15.36  0.51  2.20
3         12.14  1.19  5.03
4         23.85  2.18  4.63
5         14.52  1.78  4.40
6         9.34   3.29  2.60
7         44.60  1.94  2.68
8         12.40  2.19  4.63
9         23.40  1.39  3.73
10        4.93   0.29  1.95
11        11.16  1.62  2.58
12        10.15  1.83  2.75
13        11.14  2.31  3.03
14        11.14  1.09  2.40
15        10.17  0.94  1.40
16        10.05  0.42  2.20
17        4.95   0.21  2.13
18        7.50   0.24  1.35
19        8.16   1.35  2.30
20        5.04   1.28  3.75
21        13.18  1.05  2.15
22        18.50  1.54  1.48
23        17.73  1.94  1.88
24        8.97   0.32  1.80
25        19.56  1.26  2.30
26        21.69  1.58  2.20
27        16.21  0.84  3.20
28        8.33   3.6   1.85
29        17.65  1.66  3.93
30        25.35  0.87  3.43
31        21.01  3.56  1.80
32        20.72  4.82  1.50
33        26.52  5.05  1.80
34        13.52  3.55  4.08
35        12.00  0.37  4.20
36        9.07   0.36  4.45
37        20.20  0.22  5.05
38        25.41  0.43  4.23
39        24.29  0.23  2.85
40        16.20  0.51  3.48
41        6.14   0.35  3.45
42        19.63  0.47  3.50
43        17.49  0.92  3.08
44        20.58  0.77  2.10
45        16.65  1.76  2.13
46        4.02   1.25  2.53
47        13.40  2.32  2.13
48        22.55  1.07  2.00
49        8.15   1.12  2.40
50        21.80  0.93  2.13
51        16.32  1.17  3.63
52        15.17  1.25  3.00
53        22.56  2.1   2.68
54        20.53  1.16  3.10
55        27.58  1.2   3.40
56        15.33  0.98  2.28
57        21.41  1.33  2.70
58        10.14  0.98  2.13
59        20.70  1.38  3.23
60        26.52  1.33  2.55
61        19.01  2.95  2.53
62        11.07  0.4   2.33
63        16.56  0.8   3.03
64        17.20  0.73  3.03
65        11.02  0.2   2.30
66        14.32  0.57  2.55
67        21.71  2.47  3.05
68        14.37  1.47  2.73
69        20.56  1.44  2.85
70        17.27  0.71  3.03
71        15.07  1.2   2.03
72        15.58  2.26  2.68
73        14.20  0.56  2.75
74        18.61  1.21  3.00
75        14.65  2.01  3.88
76        18.49  1     3.10
77        22.58  0.72  2.65
78        36.95  2.79  3.03
79        14.25  2.32  2.98
80        9.56   2.65  2.68
81        19.58  1.82  2.60
82        33.37  0.54  2.65
83        21.45  1.47  2.98
84        25.03  2.16  2.48
85        3.94   0.32  2.60
86        26.80  1.38  2.13
87        15.26  0.75  1.53
88        5.21   1.93  2.18
89        19.73  1.23  2.60
90        20.32  0.45  3.05
91        19.40  0.43  2.68
92        18.73  1.76  2.88
93        18.59  1.23  1.73
94        11.09  0.52  1.95
95        11.13  0.82  2.25
96        13.08  0.62  2.35
97        14.33  0.99  2.28
98        24.92  1.69  2.53
99        17.45  0.86  2.55
100       13.35  1.16  2.20
101       12.28  1.54  2.68
102       19.61  1.01  3.63
103       39.07  1.77  2.20
104       16.59  2.42  2.43
105       17.48  0.51  3.28
106       18.25  0.44  3.20
107       13.13  0.54  2.25
108       11.14  1.28  2.03
109       7.99   0.36  1.85
110       7.01   0.69  2.23
111       9.07   1.03  2.33
112       17.34  0.92  2.20
113       18.65  0.86  3.43


Appendix III: soil chemical analyses results for the topsoil samples collected during the field spectroscopy campaign

topsoil (%)
Plot nr.  Clay   OC    Fe
1         11.07  1.48  6.43
2         4.96   0.54  3.08
3         8.07   1.33  4.18
4         12.08  1.33  2.63
5         16.33  1.49  2.90
6         9.24   4.32  2.28
7         8.10   1.27  1.90
8         8.10   1.72  3.68
9         14.12  1.29  3.55
10        1.90   0.62  1.48
11        11.40  1.89  4.00
12        14.19  2.60  2.25
13        9.19   1.84  2.68
14        9.16   2.29  1.88
15        9.11   1.05  1.90
16        4.97   0.66  1.90
17        2.93   0.38  1.35
18        0.90   0.22  0.73
19        9.04   1.54  3.33
20        5.01   1.78  1.85
21        7.00   1.06  2.03
22        3.99   1.81  2.28
23        15.42  2.65  2.25
24        12.03  0.33  1.53
25        15.28  2.36  1.73
26        39.15  2.17  6.90
27        15.12  1.21  3.28
28        16.28  1.92  3.23
29        15.19  1.73  2.95
30        9.07   2.40  1.93
31        19.39  3.58  1.63
32        20.42  4.32  2.08
33        15.42  5.92  4.25
34        19.50  4.21  3.38
35        13.00  0.21  4.10
36        10.99  0.25  3.53
37        5.94   0.45  4.18
38        7.96   0.48  2.78
39        5.92   0.18  3.10
40        9.97   0.47  3.03
41        3.93   0.43  1.95
42        12.08  0.27  2.00
43        16.07  0.27  2.10
44        8.97   0.28  1.88
45        18.40  1.62  2.08
46        22.52  1.32  1.78
47        7.25   5.92  1.53
48        10.02  0.80  1.98
49        11.04  0.57  1.73
50        14.13  0.72  3.05
51        9.07   0.88  2.28
52        9.49   2.86  2.48
53        14.59  2.54  2.45
54        2.94   0.56  2.30
55        18.35  0.48  2.83
56        1.94   1.01  2.83
57        9.00   0.97  1.73
58        11.09  0.96  3.30
59        16.26  0.99  2.68
60        32.25  1.05  1.90
61        10.03  0.92  2.25
64        16.12  0.87  1.53
65        5.94   0.24  2.78
66        18.22  0.89  2.55
67        15.21  3.01  2.80
68        31.45  0.58  2.90
69        3.96   1.26  2.60
70        10.13  1.96  2.75
71        15.08  1.19  2.55
72        20.36  1.57  2.88
73        8.02   1.16  2.83
74        23.33  0.86  3.05
75        13.16  1.06  2.60
76        10.12  0.72  2.70
77        8.99   0.37  3.48
78        20.95  6.03  1.90
79        19.67  2.30  2.70
80        21.81  2.31  2.70
81        10.29  1.62  3.23
82        23.49  0.42  2.60
83        15.44  1.18  3.28
84        9.44   2.26  2.80
85        1.10   0.21  2.33
86        13.21  0.75  1.75
87        23.42  0.74  1.95
88        12.21  0.93  2.68
89        11.22  1.37  2.18
90        9.99   0.31  2.45
91        19.16  0.24  3.08
92        6.07   1.56  2.60
93        17.74  2.89  1.78
94        9.12   0.96  2.18
95        9.10   0.90  1.83
96        12.18  0.92  2.23
97        26.92  1.65  1.80
98        26.25  0.46  2.15
99        22.21  0.46  2.28
100       11.11  0.40  2.10
101       12.20  1.06  2.23
102       12.28  0.79  3.15
103       19.49  1.91  1.70
104       7.07   0.97  3.68
105       15.20  0.30  3.48
106       1.92   0.25  1.48
107       18.20  0.42  1.85
108       15.55  0.89  1.95
109       6.99   0.47  1.83
110       2.94   0.68  2.15
111       12.33  1.65  2.75
112       10.17  0.74  1.70
113       16.27  0.59  2.50

Appendix IV: 0-20 and topsoil training sets

0-20                                topsoil
Plot Clay%  Plot OC%    Plot Fe%    Plot Clay%  Plot OC%    Plot Fe%
1    25.64  2    0.51   1    7.28   1    11.07  1    1.48   1    6.43
3    12.14  3    1.19   4    4.63   3    8.07   2    0.54   2    3.08
4    23.85  4    2.18   5    4.40   4    12.08  3    1.33   3    4.18
6    9.34   5    1.78   6    2.60   7    8.10   6    4.32   4    2.63
7    44.60  7    1.94   7    2.68   9    14.12  7    1.27   5    2.90
9    23.40  8    2.19   8    4.63   11   11.40  8    1.72   6    2.28
10   4.93   11   1.62   9    3.73   12   14.19  11   1.89   8    3.68
11   11.16  13   2.31   14   2.40   13   9.19   12   2.6    12   2.25
12   10.15  14   1.09   15   1.40   14   9.16   14   2.29   14   1.88
13   11.14  15   0.94   16   2.20   15   9.11   15   1.05   15   1.90
18   7.50   16   0.42   17   2.13   16   4.97   16   0.66   16   1.90
19   8.16   17   0.21   19   3.75   18   0.90   17   0.38   17   1.35
20   5.04   18   0.24   20   2.15   19   9.04   18   0.22   18   0.73
24   8.97   19   1.35   21   1.48   20   5.01   19   1.54   19   3.33
25   19.56  20   1.28   22   1.88   21   7.00   20   1.78   20   1.85
26   21.69  22   1.54   24   2.30   22   3.99   22   1.81   21   2.03
29   17.65  24   0.32   25   2.20   25   15.28  23   2.65   22   2.28
31   21.01  27   0.84   26   3.20   26   39.15  24   0.33   23   2.25
36   9.07   29   1.66   28   3.93   27   15.12  25   2.36   24   1.53
38   25.41  30   0.87   30   1.80   28   16.28  26   2.17   25   1.73
40   16.20  31   3.56   32   1.80   29   15.19  30   2.4    26   6.90
42   19.63  32   4.82   33   4.08   30   9.07   33   5.92   27   3.28
43   17.49  33   5.05   36   5.05   31   19.39  34   4.21   28   3.23
44   20.58  34   3.55   37   4.23   33   15.42  35   0.21   30   1.93
45   16.65  36   0.36   38   2.85   35   13.00  36   0.25   31   1.63
46   4.02   39   0.23   40   3.45   36   10.99  38   0.48   33   4.25
47   13.40  42   0.47   41   3.50   37   5.94   39   0.18   34   3.38
48   22.55  43   0.92   42   3.08   38   7.96   41   0.43   35   4.10
49   8.15   44   0.77   43   2.10   42   12.08  43   0.27   36   3.53
51   16.32  45   1.76   44   2.13   43   16.07  44   0.28   38   2.78
53   22.56  46   1.25   47   2.00   44   8.97   45   1.62   40   3.03
54   20.53  47   2.32   49   2.13   45   18.40  46   1.32   42   2.00
56   15.33  48   1.07   50   3.63   46   22.52  47   5.92   43   2.10
57   21.41  50   0.93   51   3.00   47   7.25   54   0.56   44   1.88
58   10.14  52   1.25   52   2.68   48   10.02  58   0.96   45   2.08
59   20.70  53   2.1    53   3.10   52   9.49   59   0.99   50   3.05
60   26.52  54   1.16   54   3.4    53   14.59  60   1.05   53   2.45
61   19.01  55   1.2    56   2.70   54   2.94   61   0.92   54   2.30
62   11.07  58   0.98   57   2.13   55   18.35  64   0.87   56   2.83
63   16.56  59   1.38   60   2.53   56   1.94   65   0.24   57   1.73
64   17.20  60   1.33   61   2.33   58   11.09  66   0.89   59   2.68
65   11.02  61   2.95   62   3.03   60   32.25  67   3.01   64   1.53
67   21.71  62   0.4    63   3.03   64   16.12  68   0.58   66   2.55
68   14.37  63   0.8    64   2.30   65   5.94   69   1.26   67   2.80
71   15.07  64   0.73   65   2.55   67   15.21  70   1.96   68   2.90
72   15.58  65   0.2    66   3.05   68   31.45  71   1.19   69   2.60
73   14.20  67   2.47   67   2.73   69   3.96   74   0.86   73   2.83
74   18.61  68   1.47   72   2.75   70   10.13  75   1.06   75   2.60
75   14.65  69   1.44   73   3.00   72   20.36  76   0.72   78   1.90
76   18.49  73   0.56   76   2.65   74   23.33  78   6.03   79   2.70
78   36.95  74   1.21   77   3.03   76   10.12  79   2.3    80   2.70
79   14.25  75   2.01   78   2.98   77   8.99   81   1.62   83   3.28
80   9.56   77   0.72   79   2.68   78   20.95  83   1.18   84   2.80
81   19.58  78   2.79   80   2.60   79   19.67  86   0.75   86   1.75
82   33.37  81   1.82   83   2.48   80   21.81  87   0.74   87   1.95
84   25.03  85   0.32   86   1.53   81   10.29  89   1.37   88   2.68
85   3.94   86   1.38   87   2.18   82   23.49  90   0.31   90   2.45
86   26.80  88   1.93   88   2.60   83   15.44  92   1.56   91   3.08
87   15.26  89   1.23   90   2.68   84   9.44   93   2.89   92   2.60
88   5.21   91   0.43   91   2.88   85   1.10   94   0.96   93   1.78
89   19.73  92   1.76   92   1.73   86   13.21  96   0.92   94   2.18
90   20.32  94   0.52   93   1.95   88   12.21  98   0.46   96   2.23
92   18.73  96   0.62   94   2.25   89   11.22  99   0.46   97   1.80
93   18.59  97   0.99   95   2.35   90   9.99   100  0.4    98   2.15
95   11.13  99   0.86   97   2.28   93   17.74  101  1.06   99   2.28
96   13.08  100  1.16   98   2.53   95   9.10   102  0.79   102  3.15
97   14.33  101  1.54   99   2.55   98   26.25  103  1.91   104  3.68
98   24.92  102  1.01   103  2.20   101  12.20  104  0.97   105  3.48
99   17.45  104  2.42   104  2.43   103  19.49  106  0.25   106  1.48
100  13.35  105  0.51   105  3.28   106  1.92   107  0.42   108  1.95
101  12.28  106  0.44   106  3.20   107  18.20  108  0.89   109  1.83
106  18.25  107  0.54   107  2.25   109  6.99   109  0.47   110  2.15
107  13.13  108  1.28   108  2.03   110  2.94   110  0.68   111  2.75
108  11.14  109  0.36   109  1.85   111  12.33  111  1.65   112  1.7
110  7.01   110  0.69   112  2.20   113  16.27  113  0.59   113  2.50
112  17.34  111  1.03   113  3.43


Appendix V: 0-20 and topsoil test sets

0-20                                topsoil
Plot Clay%  Plot OC%    Plot Fe%    Plot Clay%  Plot OC%    Plot Fe%
2    15.36  1    1.36   2    2.20   2    4.96   4    1.33   7    1.90
5    14.52  6    3.29   3    5.03   5    16.33  5    1.49   9    3.55
8    12.40  9    1.39   10   1.95   6    9.24   9    1.29   10   1.48
14   11.14  10   0.29   11   2.58   8    8.10   10   0.62   11   4.00
15   10.17  12   1.83   12   2.75   10   1.90   13   1.84   13   2.68
16   10.05  21   1.05   13   3.03   17   2.93   21   1.06   29   2.95
17   4.95   23   1.94   18   2.30   23   15.42  27   1.21   32   2.08
21   13.18  25   1.26   23   1.80   24   12.03  28   1.92   37   4.18
22   18.50  26   1.58   27   1.85   32   20.42  29   1.73   39   3.10
23   17.73  28   3.6    29   3.43   34   19.50  31   3.58   41   1.95
27   16.21  35   0.37   31   1.50   39   5.92   32   4.32   46   1.78
28   8.33   37   0.22   34   4.20   40   9.97   37   0.45   47   1.53
30   25.35  38   0.43   35   4.45   41   3.93   40   0.47   48   1.98
32   20.72  40   0.51   39   3.48   49   11.04  42   0.27   49   1.73
33   26.52  41   0.35   45   2.53   50   14.13  48   0.8    51   2.28
34   13.52  49   1.12   46   2.13   51   9.07   49   0.57   52   2.48
35   12.00  51   1.17   48   2.40   57   9.00   50   0.72   55   2.83
37   20.20  56   0.98   55   2.28   59   16.26  51   0.88   58   3.30
39   24.29  57   1.33   58   3.23   61   10.03  52   2.86   60   1.90
41   6.14   66   0.57   59   2.55   66   18.22  53   2.54   61   2.25
50   21.80  70   0.71   68   2.85   71   15.08  55   0.48   65   2.78
52   15.17  71   1.2    69   3.03   73   8.02   56   1.01   70   2.75
55   27.58  72   2.26   70   2.03   75   13.16  57   0.97   71   2.55
66   14.32  76   1      71   2.68   87   23.42  72   1.57   72   2.88
69   20.56  79   2.32   74   3.88   91   19.16  73   1.16   74   3.05
70   17.27  80   2.65   75   3.10   92   6.07   77   0.37   76   2.70
77   22.58  82   0.54   81   2.65   94   9.12   80   2.31   77   3.48
83   21.45  83   1.47   82   2.98   96   12.18  82   0.42   81   3.23
91   19.40  84   2.16   84   2.60   97   26.92  84   2.26   82   2.60
94   11.09  87   0.75   85   2.13   99   22.21  85   0.21   85   2.33
102  19.61  90   0.45   89   3.05   100  11.11  88   0.93   89   2.18
103  39.07  93   1.23   96   2.10   102  12.28  91   0.24   95   1.83
104  16.59  95   0.82   100  2.20   104  7.07   95   0.9    100  2.10
105  17.48  98   1.69   101  2.68   105  15.20  97   1.65   101  2.23
109  7.99   103  1.77   102  3.63   108  15.55  105  0.3    103  1.70
111  9.07   112  0.92   110  2.23   112  10.17  112  0.74   107  1.85
113  18.65  113  0.86   111  2.33
