Use of Decision Tree Analysis for Predictive Soils Mapping and Implementation on the Malheur County, Oregon Initial Soil Survey
Total Page:16
File Type:pdf, Size:1020Kb
AN ABSTRACT OF THE THESIS OF Sarah Jane Hash for the degree of Master of Science in Soil Science presented on May 29, 2008. Title: Use of Decision Tree Analysis for Predictive Soils Mapping and Implementation on the Malheur County, Oregon Initial Soil Survey Abstract approved: Jay S. Noller Soil surveys provide essential information for making land use and management decisions on publicly-owned lands in the semi-arid Great Basin. Soil maps produced with conventional mapping techniques are time-consuming, costly, and do not explicitly document the soil scientist’s mental soil-landscape model. Predictive soils mapping using decision tree analysis (DTA) can increase mapping efficiency and accuracy by extracting relationships between soil types and environmental variables, applying these relationships to predict soil types for unmapped areas, and explicitly documenting the process. While DTA has been used for soils mapping in the past, no research exists concerning the use of DTA for predictive soils mapping on an active Natural Resources Conservation Service (NRCS) soil survey. This research documents the procedure for producing and validating preliminary soils maps using DTA on the Malheur County, Southern Part Soil Survey (MCSPSS), documents interactions with survey staff, and proposes a system for predictive mapping implementation on the MCSPSS and other soil surveys. In the early stages of the project, four sets of predictive maps were produced. The June 16, 2007 predictive map used Owyhee County, Idaho soil survey information as training data to predict soil types for three adjacent quads in Malheur County. Predictive accuracy was low (67% at the order level, 61% at the suborder level, and 35% at the great group level) and this approach was abandoned. Subsequent predictive runs used soils mapping within the MCSPSS area, completed during the 2006 and 2007 field seasons, to predict soil map units (SMUs) for adjacent quads. The July 23, 2007 predictive map used training data from four east-west trending quads to SMUs for a surrounding annulus of quads. Accuracy improved significantly (87% at the order level, 73% at the suborder level, 67% at the great group level, and 53% at the subgroup level). The August 3, 2007 predictive map used all completed mapping from the 2006 and 2007 field seasons to predict SMUs for adjacent quads. Introduction of additional training data, representing greater area and more environmental variable combinations, did not improve accuracy (80% at the order level, 67% at the suborder level, 55% at the great group level, and 32% at the subgroup level). The September 14, 2007 predictive run used the same input variables and soils training data, but the soils information was recoded from SMUs to the subgroup level of soil classification. Accuracy was expected to increase because contiguous mapped areas would be larger and encompass more potential variation at a higher level of classification. However, accuracy was not significantly different from the August 3, 2007 predictive run. These results led to the prediction of SMUs for one quad, Threemile Hill, based on training data from four adjacent quads (October 12, 2007 predictive run). Accuracy improved substantially (91% at the order level, 83% at the suborder level, 83% at the great group level, 34% at the subgroup level). Confusion matrix analysis (for SMUs) was performed for the training area and the predicted area, and yielded overall accuracies and Kappa coefficients of 98.5% and 0.98 and 74.0% and 0.67 for the training and predicted areas, respectively. The April 10, 2008 predictive run used the same input variables and training data as the October 12, 2007 run, but introduced surficial geology variables (surficial age and surficial lithology). The overall accuracy and Kappa coefficient were not significantly different from the October 12, 2007 run, but line placement was more precise at delineation boundaries and the locations of prediction errors were different. DTA is effective for producing preliminary soils maps, but personnel, computer hardware, and software constraints currently prevent implementation on the MCSPSS. Skepticism and resistance among field scientists is also a major barrier. The proposed system for implementation requires certain personnel: an experienced field scientist with area-specific knowledge who would carry out the DTA and produce pre-maps, a GIS analyst who would provide data acquisition, data preparation and digitization support, and a soil survey project leader who would oversee the predictive mapping process. All would attend a week-long predictive mapping training session. An agency employee would conduct training sessions and serve as the support and issue resolution contact. Full documentation of procedures would be provided as an NRCS technical note. Adequate support in early stages of adoption is essential for long-term success. ©Copyright by Sarah Jane Hash May 29, 2008 All Rights Reserved Use of Decision Tree Analysis for Predictive Soils Mapping and Implementation on the Malheur County, Oregon Initial Soil Survey by Sarah Jane Hash A THESIS Submitted to Oregon State University in partial fulfillment of the requirements for the degree of Master of Science Presented May 29, 2008 Commencement June 2009 Master of Science thesis of Sarah Jane Hash presented on May 29, 2008. APPROVED: Major Professor, representing Soil Science Head of the Department of Crop and Soil Science Dean of the Graduate School I understand that my thesis will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my thesis to any reader upon request. Sarah Jane Hash, Author ACKNOWLEDGEMENTS First of all, I’d like to thank my family for their unconditional love, numerous sacrifices, great patience, and inspiration. Thanks to my dad, John, a true hillbilly philosopher, for sharing his awareness of the world, his inquisitive and questioning nature, and reminding me that people and their ideas can and should evolve with time. Thanks to my mom, Deb, who has shown me that selflessness is a great virtue, that prayer can move mountains, and that true understanding of soil and the land can’t be attained in a classroom. Thanks to my brother and sister, Thomas and Emily, for their friendship, perseverance, and strength and wisdom beyond their years. I’d like to give special thanks to my partner, Sylvan, for sharing a wonderful journey with me that has traversed the country and spanned many years. His great love, emotional support, and constant presence have kept me pushing forward. Many friends, too numerous to be named here, deserve equal praise for keeping me grounded, giving advice, listening, and enjoying good times. Thanks to my advisor, Jay Noller, for sharing his expansive geomorphology and soil genesis knowledge and for mindfully guiding this research. Thanks to NRCS soil scientists Alina Rice, Mark Keller and Shanna Bernal-Fields and BLM range scientist Charlie Tackman for facilitating field work, providing soil and vegetation descriptions, and generously sharing their expert knowledge of the area. I’d like to thank fellow predictive mapping graduate students Abdelhamid Elnaggar, Sheila Slevin, Melanie Malone and Jeff Pace for their cooperative input. Sylvan Williams also provided computer and software support for this project. Thanks to the NRCS National Geospatial Development Center for funding this research and providing data. NRCS MO-1 and MO-3 Region offices provided financial, data and personnel support. TABLE OF CONTENTS Page 1. Introduction..................................................................................................................... 1 2. Literature Review............................................................................................................ 5 2.1. Geologic Setting ...................................................................................................... 5 2.1.1. Regional Geology ............................................................................................. 5 2.1.2. Local Area Geology.......................................................................................... 6 2.2. Vegetation and Climatic History ............................................................................. 8 2.2.1. Pleistocene ........................................................................................................ 9 2.2.2. Holocene ......................................................................................................... 12 2.3. Study Area Soils .................................................................................................... 13 2.3.1. Surface Features.............................................................................................. 13 2.3.2. A Horizons and Epipedons ............................................................................. 17 2.3.3. Subsurface Horizons and Features.................................................................. 19 2.3.4. Regional Soil Surveys..................................................................................... 21 2.4. Decision-Tree Analysis in Predictive Soils Mapping............................................ 22 2.5. Technology Acceptance......................................................................................... 24 3. Use of Decision Tree Classifier Software for Predictive Soils Mapping on the Initial Soil Survey of Malheur County, Oregon...........................................................