INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA, 800-521-0600

A THEORETICAL BASIS AND METHODOLOGY FOR THE QUANTITATIVE EVALUATION OF THEMATIC MAP SERIES FROM SAR/INSAR DATA

DISSERTATION

Presented in Partial Fulfillment of the Requirements for

the Degree Doctor of Philosophy in the Graduate

School of The Ohio State University

By

Paula J. Stevenson, M.S. *****

The Ohio State University 2001

Dissertation Committee:

Dr. N. W. J. Hazelton, Advisor
Dr. J. Raul Ramirez, Co-Advisor
Dr. Ayman F. Habib

Approved by:
Advisor
Geodetic Science and Surveying

UMI Number: 3022579


UMI Microform 3022579 Copyright 2001 by Bell & Howell Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

Bell & Howell Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346

ABSTRACT

Synthetic aperture radar (SAR) and interferometric SAR (InSAR) data are increasingly being used for specific operational purposes such as detailed elevation maps, detection of military targets, and coastline mapping of perpetually cloud-covered areas. One topic that has been studied extensively since the 1970's is the generation of thematic maps from these data. However, most of the relevant literature relies on highly labor-intensive approaches to yield "accurate" results for a particular scene, by fine-tuning parameters to minimize the "error" in the scene (as compared to sampled ground truth for the same scene). Consequently, it remains to be seen whether or how these data can be used to produce thematic map series efficiently and reliably in the face of varying landscapes, sensors, processors, classifiers, and output requirements. To the best of our knowledge, no one has yet examined the linked, complex, and multi-faceted issues involved in using SAR/InSAR data for this purpose; indeed, even a basis for conducting such a study has not been determined.

This study adapts recent ISO (International Organization for Standardization) standards on measurand, repeatability, and reproducibility and applies them to the study of these issues. The standards are applied to analyze the range of measurement uncertainties associated with the end-to-end processes that are involved in generating thematic maps. These processes are: (1) the physical interaction of the SAR/InSAR signal with various terrain and landscape characteristics; (2) antenna characteristics and signal processing steps in generating an image; (3) image classification models and algorithms; and (4) standard map output requirements. The primary outcome is the development of a methodology through applying the ISO principles to thematic map classification of SAR/InSAR data. The methodology is expected to aid in determining the expected quality of a SAR/InSAR-based thematic map series and its fitness for intended end uses, by associating a fuller measure of confidence with the results.

Keywords: synthetic aperture radar (SAR), interferometric synthetic aperture radar (InSAR), image classification, thematic map series, measurement uncertainty.

Dedicated to the Giver of all life and my Sustainer

ACKNOWLEDGMENTS

"It takes a village to raise a child" was a common phrase of the 90's. I strongly identify with this sentiment when it comes to my own painfully slow and not-so-steady progress—that it does, indeed, take a community to complete a Ph.D. degree. In my own case, the community has been larger than most, through studies that first began when our oldest son was 2 and culminate now that he is 21. Over this two-decade span of part-time studies and part-time work, through great patience, support, and grace from many family members, professors, supervisors, and friends, I was ushered through a bachelor's degree in 1989, a master's in 1993, and finally a Ph.D. in 2001. So ends thousands of hours of study, accompanied by the raising of a husband and four energetic and engaging sons.

My supporters over the years are too numerous to mention—nonetheless, I will try to enumerate those who stand out the most, beginning with my family. My husband, Dave ("beloved") and my greatest friend, is and always will be first and foremost. He has been a steady rock and constant encourager. I can never repay his enormous sacrifices of career and income or his dedication to me and our children. During the long days and sometimes nights, he has always been there when I could not be, carrying out the many activities needed to run a lively and growing family. Next, I offer unending gratitude to our sons Bill, Rob, Tim, and Eric, whose patience with my busy schedule, help with household chores, and understanding of my frequent absences made possible my academic and career progress through the years. I am proud of you and the godly, strong, and talented young men you have become, and I know that each of you will go far in life. Finally, I acknowledge both my own and my husband's parents for their encouragement and total absence of criticism for our non-conventional lifestyle, and of course for their ongoing practical support and all of the other ways that they have always been there for us. No one could ask for better parents.

Next, I owe a debt of gratitude to my professors for their dedication and enthusiasm for their professions, and particularly for accommodating me as a non-traditional student. I cannot count how many times a professor took the time and interest one-on-one to answer my many questions, arrange special scheduling, or explain difficult concepts. In undergraduate studies, two physics professors in particular stood out: Drs. Richard Boyd and Richard Kass, and I recall our interactions on quantum physics and electronics with fondness and gratitude. Several gifted teachers who were particularly helpful to me in graduate coursework included Profs. Joseph Loon, Kurt Novak, Anton Schenk, Christian Heipke, Clyde Goad, John Josephson, and especially Raul Ramirez as master's thesis advisor.

Several professors contributed to various phases of my dissertation research. Whereas most dissertation students work closely with a single professor, I have benefited from rich interactions with professors in geodetic science (Drs. John Bossler, Raul Ramirez, Bill Hazelton, C. K. Shum, Rongxing Li, and Ayman Habib), civil engineering (Drs. John Lyon and Carolyn Merry), geological and polar studies (Drs. Kenneth Jezek and Ralph von Frese), geography (Dr. Joel Morrison), and computer science (Dr. Terrence Caelli). Each contributed ideas, references, critiques, and/or encouragement toward what evolved into a highly complex, interdisciplinary topic.

Five advisors in particular stood out at different phases of the dissertation research. They are, in the order of their involvement: John Bossler, who helped establish the initial direction; Terry Caelli, who advised the early phases of the research; Ken Jezek, whose patient listening, SAR/InSAR expertise, and enlightening comments opened the window to new discoveries; Bill Hazelton, whose "new paradigm" emphasis helped me to see beyond the current state of the art; and Raul Ramirez, who ushered me through the difficult final months. Also, many thanks to Joel Morrison and Ayman Habib for their review and helpful critique of the dissertation draft. I owe a special thanks to the Office of Research (Drs. Edward Hayes and Linda Meadows) and the Center for Mapping (Drs. Terrence Caelli and Joel Morrison) for partially supporting my research through funding and office space.

None of this would have been possible without the support and flexibility of my supervisors in 24 years of employment at Ohio State. Matey Janata was a master of encouragement in my early days and was delighted when I first returned to college in 1982. Dr. Robert Dixon of the Instruction and Research Computer Center encouraged my academic interests and professional development, allowing a flexible schedule to accommodate the needs of work, home, and studies. Later supervisors Del Waggoner, Fran Blake, John Bossler, Terrence Caelli, Joel Morrison, and Jim Lee provided similar supportive environments. Further, I acknowledge with gratitude the overall climate at Ohio State, which prizes the pursuit of knowledge, encourages employee development, and sets no limits based on gender or standing in life.

Finally, I mention my friends. Personal friends Claudia Cook, Melanie Hockenberry, and Patty Perkins have been there with me throughout the years, encouraging and patient, and refraining from asking too many times, "so, are you done yet?" Later encouragement from Mary Wallake, Mary Atzenhoefer, and Chris Putnam helped pull me through the tough times. Many other friends helped to bring me through courses that seemed, at times, beyond my threshold of comprehension and endurance. Those who provided special and unfailing support at difficult moments included Ed Oshel, Angela Mallett, and Morelia Arrieche.

The large debt that I owe to these and many others can never be repaid. It has been said, "You can't pay back; you can only pay forward." I can only hope to pass on to others as much as I have been so richly given.

VITA

June 18, 1955 ...... Born — Columbus, Ohio

EDUCATION

1989 ...... B.S. Engineering Physics, Specialization in Electrical Engineering, The Ohio State University

1993 ...... M.S. Geodetic Science and Surveying, Digital Mapping Track, The Ohio State University

WORK EXPERIENCE RELATED TO DEGREE

Center for Mapping, The Ohio State University

1988-1994 ...... Computer Specialist

1994-1998 ...... Assistant Director

1998-2000 ...... Research Associate 2

JOURNAL PUBLICATIONS

“Fundamentals of Large-Format Scanning Technology,” Stevenson, Paula J., Surveying and Land Information Systems, Vol. 52, No. 3, 1992, pp. 163-168.

List of other (non-journal) publications available upon request.

FIELDS OF STUDY

Major Field: Geodetic Science and Surveying

TABLE OF CONTENTS

Page

Abstract...... ii

Dedication ...... iv

Acknowledgments ...... v

Vita...... ix

List of Tables...... xv

List of Figures...... xvii

Chapters:

1. Introduction...... 1

1.1 Introduction, History, and Uses of SAR/InSAR ...... 1
1.2 SAR and InSAR: First Principles ...... 3
1.3 Rationale for Topic Under Investigation ...... 8
1.3.1 Migration from Research Studies to Operational Mapping ...... 8
1.3.2 SAR/InSAR Thematic Mapping: Still Stalled at the Experimental Stage ...... 9
1.4 The Approach Part One: Standards for Uncertainty in Measurement ...... 12
1.4.1 Development of New Standards ...... 15
1.4.2 Standard Terminology ...... 16
1.4.3 Reporting Requirements ...... 19
1.5 The Approach Part Two: Applying GUM to the Chain of SAR/InSAR Processes ...... 21
1.6 Expected and Potential Outcomes ...... 25
1.7 Footnotes and References ...... 26

2. The Scientific Method ...... 33

2.1 Paradigms, Normal Science, and Theories ...... 35
2.2 Designing a Scientific Experiment ...... 38
2.3 Scientific Integrity ...... 41
2.4 SAR/InSAR Classification Practices: Science or Pseudo-Science? ...... 43
2.5 Critique on Current Paradigm ...... 43
2.5.1 Cartographic Map Generalization ...... 46
2.5.2 Remote Sensing Classification Paradigm ...... 47
2.5.3 Machine Learning Paradigm ...... 48
2.6 Critique on SAR/InSAR Classification and the Scientific Method ...... 49
2.7 Critique on SAR/InSAR Classification and Scientific Integrity ...... 52
2.8 Summary ...... 52
2.9 A Case Study ...... 53
2.9.1 The Methodology ...... 54
2.9.2 The Results ...... 56
2.9.2.1 Limiting Factor 1: Poor Satellite Data Quality ...... 56
2.9.2.2 Limiting Factor 2: Variances in Classification Methods and Interpreter Skills ...... 57
2.9.2.4 Limiting Factor 3: Limitations in Reference Data ...... 58
2.9.2.5 Limiting Factor 4: Resource Limitations ...... 58
2.9.3 General Observations by Authors ...... 59
2.9.4 IGBP Future Plans ...... 60
2.9.5 An Independent Critique of the Global Mapping Project ...... 61
2.10 References ...... 62

3. Review of Accuracy Standards ...... 65

3.1 General Measurement Standards ...... 66
3.1.1 Guide to the Expression of Uncertainty in Measurement ...... 67
3.1.2 ISO 5725-2 Accuracy of Measurement Methods and Results, Part 2: Repeatability and Reproducibility ...... 73
3.1.3 NIST Technical Note 1297 — Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results ...... 75
3.1.4 ISO TC69/SC6/WG7 — Statistical Assessment of the Uncertainty of Measurement Results ...... 77
3.2 Geospatial Data Standards ...... 82
3.2.1 Spatial Data Transfer Standard ...... 83
3.2.2 Content Standard for Digital Spatial Metadata ...... 86
3.2.3 Content Standard for Digital Spatial Metadata and Extensions for Remote Sensing Metadata ...... 88
3.2.4 ISO/TC211 Draft Standard for Geographic Information/Geomatics ...... 90
3.3 Summary and Observations ...... 95
3.4 Notes and References ...... 98

4. The New Methodology ...... 100

4.1 Determining the Measurand ...... 101
4.2 Scales of Measurement ...... 103
4.3 Evolution of Nominal Statistics in Remote Sensing Thematic Maps ...... 104
4.4 Comparison of Categorical (Nominal) and Interval Statistics ...... 105
4.5 Illustration of Quantitative vs. Qualitative Statistics ...... 106
4.6 Other Sources of Uncertainty ...... 113
4.7 The New Methodology ...... 115
4.8 Adapting Uncertainty Measures in Chemical Analysis to Thematic Mapping ...... 117
4.9 Discussion of Calibration Standards and Traceability ...... 119
4.10 References ...... 121

5. Uncertainty in Sensor-Scene Interactions...... 123

5.1 Analysis of Uncertainty of the Radar Point Equation ...... 124
5.2 Analysis of Uncertainty in the Radar Cross Section ...... 131
5.2.1 Analysis of Uncertainty in Speckle ...... 131
5.2.2 Analysis of Uncertainty in Surface Roughness ...... 135
5.2.3 Analysis of Uncertainty of Local Incidence Angle ...... 140
5.2.4 Analysis of Uncertainty of Dielectric Properties ...... 146
5.3 Analysis of Uncertainty in Other Factors ...... 149
5.3.1 Analysis of Uncertainty in Polarimetry ...... 149
5.3.2 Analysis of Uncertainty of Scene-Sensor Response in Interferometric SAR ...... 155
5.3.2.1 Uncertainties in Airborne InSAR ...... 157
5.3.2.2 Additional Uncertainties in Repeat-Pass InSAR ...... 158
5.3.2.3 Further Considerations in InSAR ...... 159
5.3.3 Analysis of Uncertainty of Correlation Data in Repeat-Pass Interferometry ...... 161
5.3.4 Description of Other Effects ...... 162
5.4 Summary of Uncertainties in Sensor-Scene Interactions ...... 166
5.5 Note and References ...... 167

6. Uncertainty in Antenna/Receiver and Processor Functions...... 171

6.1 Uncertainty Due to Receiver Noise ...... 173
6.2 Uncertainty Due to Doppler and Range Ambiguities ...... 174
6.3 Uncertainty Due to Radiometry and Geometry ...... 178
6.4 Other Uncertainties in SAR Processes ...... 182
6.5 Uncertainties in InSAR Processing ...... 183
6.6 Summary of System and Processing Effects ...... 187
6.7 References ...... 188

7. Uncertainty in Classification Models and Algorithms ...... 191

7.1 Problems in the Current Modus Operandi and Examples ...... 193
7.2 Applying the Methodology to the Determination of Uncertainty in Classification Models and Algorithms ...... 199
7.3 Applying the Methodology to Determine Uncertainty in a Sample Case ...... 200
7.4 References ...... 204

8. Uncertainty in Map Output Requirements...... 208

8.1 The Evolution of Standards in GIS Databases ...... 210
8.2 The National Spatial Data Infrastructure ...... 216
8.3 Evaluation of Uncertainty in the Vegetation Classification Standard ...... 218
8.4 Summary ...... 225
8.5 Note and References ...... 226

9. Summary and Conclusion ...... 227

9.1 General Summary ...... 227
9.2 Applying the New Methodology to Thematic Map Classification of SAR/InSAR Data ...... 230
9.3 Conclusion ...... 239
9.4 Questions for Future Research ...... 240
9.5 Note ...... 241

APPENDICES

Appendix A: Paradigms Adapted for SAR/InSAR Classification ...... 242

A.1 Paradigm 1: Cartographic Map Generalization ...... 243
A.2 Paradigm 2: Remote Sensing Image Classification ...... 249
A.2.1 Supervised Classification Techniques ...... 251
A.2.2 Unsupervised Classification ...... 252
A.2.3 Accuracy Assessment ...... 253
A.3 Paradigm 3: Machine Learning ...... 258
A.3.1 The Problem of Accuracy Assessment ...... 263
A.4 General Observations ...... 265
A.5 The Problem of Classifications ...... 266
A.6 Conclusion ...... 269
A.7 References ...... 269

Appendix B: Discussion of Issues Raised in This Investigation ...... 271

B.1 The Normal Distribution ...... 271
B.2 Spatial Averaging in SAR ...... 272
B.3 Backscatter ...... 273
B.4 The Measurand Revisited ...... 277
B.5 The Measurand and Preliminary Methodology ...... 281
B.6 Additional Paradigms to Consider ...... 287
B.7 References ...... 295

BIBLIOGRAPHY...... 297

LIST OF TABLES

Table Page

3.1 Summary of Procedure for Evaluating and Expressing Uncertainty ...... 70

3.2 Data Quality Elements and Sub-Elements...... 91

3.3 Level 1 Components of Imagery and Gridded Data, ISO 19124 ...... 94

4.1 True DEM Values, Measured Values, and Differences ...... 111

4.2 Summary of New Methodology ...... 116

5.1 Overview of Uncertainty Analysis in Sensor-Scene Interactions...... 125

5.2 Dependence of Dielectric Constant on Frequency ...... 147

5.3 Sources of Vertical and Horizontal Uncertainty in the Dual-Antenna Airborne Case...... 157

5.4 Summary of Uncertainties in Sensor-Scene Interactions...... 166

6.1 Overview of Uncertainty Analysis in Antenna/Receiver and Processor Functions...... 172

6.2 ERS-1 SAR Radiometric Uncertainty ...... 180

6.3 Summary of Major System and Processing Effects on the Uncertainty of SAR/InSAR Measurements ...... 188

7.1 Examples of Data Products Derived from the Basic SAR Signal ...... 193

7.2 Analysis of Study to Determine Building Footprints ...... 196

7.3 Analysis of Study to Determine Crop and Soil Conditions ...... 197

7.4 Analysis of Study to Classify Vegetated Areas ...... 198

7.5 An Example Methodology for Establishing Repeatability and Reproducibility in Model and Algorithm Classification ...... 201

7.6 Overview of Uncertainty Analysis in Classification Models and Algorithms ...... 202

8.1 Feature Class Descriptors for Three Hypothetical Databases ...... 213

8.2 Example of One Class, Subclass, Group, Subgroup, and Formation of the National Vegetation Classification Standard ...... 219

8.3 An Example Methodology for Establishing Repeatability and Reproducibility for Map Classification Standards ...... 222

8.4 Overview of Uncertainty Analysis in Map Classification Output Requirements...... 223

8.5 Classification Confidence Levels for Associations ...... 224

A.1 Suggested Maximum Scales of Photographic Products as a Function of Effective Ground Pixel Size ...... 246

A.2 Types of Machine Learning Systems ...... 260

LIST OF FIGURES

Figure Page

1.1 Components of an electromagnetic wave ...... 4

1.2 Polarization ellipse ...... 5

1.3 SAR platform/scene configuration and effect on resultant image range ...... 5

1.4 InSAR platform configuration and geometric relationships ...... 8

1.5 A depiction of accuracy and precision ...... 13

1.6 Four basic processes in generating landscape maps from SAR/lnSAR ...... 22

1.7 Flowchart of SAR processor functions ...... 24

2.1 The scientific method, through paradigms that direct normal science, offers a framework for the development of theories and scientific experiments ...... 34

2.2 Typical methodology for single-use “disposable” classifiers ...... 44

4.1 The real world, representation model, and measurement model ...... 102

4.2 Real world and ideal model of world as DEM (measurand) ...... 107

4.3 Reconstructed world with error and DEM as measured ...... 107

4.4 Scatterplot of matched pair differences between “truth” grid cells and “measured” grid cells in Figures 4.2 and 4.3 ...... 112

4.5 Error matrix/truth table for comparing the truth with the measured quantity...... 112

5.1 Decibel Conversion Chart ...... 126

5.2 Relative theoretical uncertainties for the radar point equation ...... 129

5.3 Complex linear superposition of individual scattering centers produces speckle ...... 133

5.4 Single-look SAR image, showing evidence of speckle ...... 133

5.5 A SAR image after multilooking (averaging) applied ...... 134

5.6 Model of surface roughness ...... 136

5.7 Relative theoretical uncertainties for the surface roughness equation 139

5.8 Relationship of incidence and local terrain angles ...... 140

5.9 Effect of surface roughness and local incidence angle ...... 143

5.10 Relative theoretical uncertainties for the equation involving the local incidence angle ...... 146

5.11 The relationship between the measured dielectric constant and the volumetric soil moisture at 1.4 GHz...... 149

5.12 The figure on the left is a cross-polarized response, while the figure on the right is a co-polarized response for a grass surface ...... 153

5.13 Optimum resolution for minimum height uncertainty, depending on the slope of the ground ...... 156

5.14 Atmospheric humidity and pressure effects on the phase angle ...... 159

5.15 Bragg scattering occurs when features are aligned along the phase front of the signal ...... 164

5.16 Trihedral reflector, used along with dihedral reflectors as ground control points ...... 165

6.1 The terrain point is localized in two dimensions at the intersection of the Doppler shift in azimuth and the time delay in range ...... 176

6.2 Wavefront curvature in the range/Doppler domain ...... 183

6.3 The motion track of airborne interferometers needs to be compensated in order to obtain more accurate positions in the final product ...... 187

9.1 Four basic processes in generating landscape maps from SAR ...... 228

9.2 Summary of steps for determining the final uncertainty of measurements ...... 231

9.3 Reformulated end-to-end model for thematic maps in production mode ...... 237

A.1 McMaster and Shea's Raster Generalization Operators ...... 247

A.2 Confusion (error) matrix for assessing classification accuracy...... 254

A.3 Observers classify objects differently ...... 267

B.1 Comparison of simple averaging with linear superposition ...... 273

B.2 Comparison of world and SAR representation of the world ...... 278

B.3 Restricting the domain offers a greater probability of inferring the scatterer properties and characteristics from the signal ...... 280

B.4 Comparisons between map classification and determining the constituency of chemical compounds ...... 294

CHAPTER 1

INTRODUCTION

1. Background

1.1 Introduction, History, and Uses of SAR/InSAR

RADAR is an acronym for RAdio Detection And Ranging. Radars operate in the microwave part of the electromagnetic (EM) spectrum, at wavelengths above the visible and thermal infrared regions. Radars generally use coherent wavelengths from 1 mm to 1 m [1]. The advantage of operating in the microwave portion of the spectrum is that the earth's atmosphere is nearly transparent to the sensor, allowing signal penetration of clouds, smoke, haze, and most precipitation and weather conditions as if they do not exist. Radars are active sensors, transmitting and receiving a signal of electromagnetic energy in a way that is independent of the time of day (or night) and the sun's illumination.

The history of synthetic aperture radar (SAR) began in the late 19th century, when Heinrich Hertz first demonstrated that near-microwave (radio) waves could be used to generate reflections of objects. In 1904 Huelsmeyer obtained the first patent to detect ships using radar. However, the first practical use of radar did not come about until the 1930's, when R. M. Page demonstrated the first airborne pulsed radar system. Military use dominated radar development until after World War II, when civilian researchers began investigating imaging radar systems for geoscience applications and its value in this arena was established. In the early 1950's, engineers recognized that instead of rotating an antenna to scan the target area, it could be fixed to an aircraft fuselage. The concept of side-looking airborne radar (SLAR) was soon implemented using real aperture imaging radars, which define azimuth resolution according to the beam width of the antenna. [2,3]

Simultaneous with the development of SLAR, it was observed that the Doppler frequency could be used to perform spectral analysis, supporting a high azimuthal spatial resolution that remained constant as a function of range. In 1971 these concepts were implemented in a practical way, using an innovative optical processor that employed the concept of a Fresnel zone plate. Several civilian airborne SAR systems were sponsored by NASA in the late 1960's and early 1970's, primarily built by the Environmental Research Institute of Michigan (ERIM) and the Jet Propulsion Laboratory (JPL). But the systems proved to have severe limitations, so in the late 1970's the same optical model was implemented using digital processing techniques based on Fourier transforms.

Spaceborne SAR missions were carried out in 1978 (SeaSAT), followed by the Shuttle Imaging Radar (SIR) series of flights in the 1980's and early 1990's. Three planetary SAR missions to map cloud-covered Venus were carried out in 1978 (Pioneer), 1983 (Venera 16, USSR), and 1990 (Magellan). The Cassini mission followed to map Saturn and Titan, which have optically opaque atmospheres. Although the basic technology was developed in the U.S., airborne and spaceborne SAR missions have also been carried out by Canada, Denmark, France, Germany, Japan, China, and other nations. [4,5] The technology and data quality continued to improve in the 1990's, and the use of SAR for operational mapping began increasing in the late 1990's.

Satellite-based interferometric SAR applications were first used in the early 1970's to generate topographic contours of the Moon and Venus, while terrestrial airborne SAR interferometry was first reported by Graham in 1974. More recently, InSAR (as it later came to be known) has come into use for studying dynamic geologic phenomena such as earthquake displacement and glacial movement. [6]

Many different platform and processing configurations are used in SAR and InSAR, which allow particular, targeted aspects of ground and/or ocean features to be accentuated and optimized. Their usefulness has been demonstrated for specific applications in defense, agriculture, forestry, geology, hydrology, sea ice, ocean, land use, and other areas.

1.2 SAR and InSAR: First Principles

SAR is a well-behaved, coherent pulsed signal that can be represented by a spherical wavefront. The signal can be approximated locally as an electromagnetic, polarized plane wave represented by the expression

E = E₀ exp[i(ωt − κz)],

where E is the complex electric field at a given time t and location z from the initial field E₀. Here ω = 2πf = 2π/τ, where the period τ = 1/f; κ is the wave number, κ = ω/c = 2π/λ; and λ is the wavelength of the coherent signal. Figure 1.1 illustrates the components of a basic electromagnetic wave. The return waveform itself is comprised of an amplitude, phase, and polarization ellipse (see Figure 1.2). These three main attributes are manipulated and combined from one or more scenes to produce a variety of outputs, such as a magnitude image, interferogram, coherency image, Stokes matrix, and others. As a range/Doppler radar, the range coordinates are proportional to the time delay, while the azimuth coordinates are derived from the Doppler frequency gradient across the scene.

[Figure 1.1: Components of an electromagnetic wave, showing the electric field E and the magnetic field H. The plane of polarization is defined by the electric field E. (from [7])]
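As a quick numerical companion to these relations, the sketch below evaluates ω, κ, and the complex field for a C-band signal. All parameter values are illustrative assumptions, not figures from this chapter:

```python
import numpy as np

# Evaluate the plane-wave approximation E = E0 * exp(i*(omega*t - kappa*z)).
# All parameter values are illustrative assumptions, not system specifications.
c = 3.0e8                        # speed of light, m/s
f = 5.3e9                        # assumed C-band frequency, Hz
wavelength = c / f               # lambda, about 5.7 cm
omega = 2 * np.pi * f            # angular frequency: omega = 2*pi*f = 2*pi/tau
kappa = 2 * np.pi / wavelength   # wave number: kappa = omega/c = 2*pi/lambda

E0 = 1.0 + 0.0j                  # complex initial field (unit amplitude)
t, z = 1.0e-9, 0.25              # a sample time (s) and location (m)
E = E0 * np.exp(1j * (omega * t - kappa * z))

print(f"lambda = {wavelength*100:.2f} cm, |E| = {abs(E):.3f}, "
      f"phase = {np.angle(E):+.3f} rad")
```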

Figure 1.3 shows the effect of the time delay dependency on the image range, which produces distortion in the image, seen as foreshortening on front-facing slopes and as shadowed areas (no data) on rear-facing slopes.

The strength of the return signal in terms of its major parameters is given by the well-known radar equation. Although many variations of this equation are in use, a common one is expressed in terms of the total power received at slant range R [10]:

P_r = P_T σ G² λ² / [(4π)³ R⁴]

[Variables will be defined in Chapter 5.]
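As a numerical illustration of how these parameters trade off (the values below are placeholder assumptions, not any particular system's figures), note especially the fourth-power falloff with slant range:

```python
import numpy as np

# Point-target radar equation: Pr = Pt * sigma * G^2 * lambda^2 / ((4*pi)^3 * R^4).
# All values are illustrative placeholders.
Pt = 5.0e3                # transmitted peak power, W
G = 10 ** (35 / 10)       # antenna gain, 35 dB expressed as a ratio
lam = 0.0566              # wavelength, m (C-band)
sigma = 1.0               # radar cross section, m^2
R = 850e3                 # slant range, m (spaceborne geometry)

Pr = Pt * sigma * G**2 * lam**2 / ((4 * np.pi) ** 3 * R**4)
print(f"received power = {Pr:.3e} W ({10*np.log10(Pr):.1f} dBW)")

# Doubling the slant range cuts the received power by a factor of 16 (12 dB).
print(f"at 2R: {Pr / 16:.3e} W")
```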

SAR data can be combined with InSAR to provide additional information for image classification [11,12,13]. InSAR (e.g., from a single-pass, dual-antenna configuration) can help distinguish classes by providing detailed elevations across a scene. (Note to reader: the terms "features" and "classes" are used interchangeably throughout this document.) Areas with similar elevations can be distinguished from those with dissimilar elevations. Feature elevations can range from very low (i.e., blacktop, concrete), to low (grassland and low crops), to moderate (bushes, tall crops, and small trees), to moderately high (trees, houses), to high (large trees, high rock outcrops, higher buildings). The simultaneous use of several different wavelengths (X, C, L, etc.) in InSAR mode can help identify features within parts of the elevation range from "very low" to "high", since some features can be more readily distinguished in one wavelength than another. The spatial distribution of elevations across a scene can offer further information helpful in classification: for example, a rapidly-repeating change in feature height from flat surface to grass, tree, and rooftop may identify an urban area.

[Figure 1.2: Polarization ellipse, showing the major and minor axes. (from Ulaby [8])]

[Figure 1.3: SAR platform/scene configuration and its effect on resultant image range, producing shadow and distortion in the image. (from Raney [9])]

Further, differences in elevation over time can help to distinguish classes. For example, suppose an InSAR image taken in early spring shows a series of large, rectangular flat regions. In mid-summer an InSAR image is taken of the same area again. Differences in height can indicate crop growth and therefore designate an agricultural area that was bare on the first pass but on the later pass revealed different heights of crops from field to field (for example, soybeans develop into a low cover crop, while corn grows into a relatively tall crop).

Another way that InSAR can aid in identifying feature changes over time is through the use of coherency information. A coherency map from multipass InSAR (e.g., derived from a dual-pass, single-antenna configuration) can identify temporal decorrelation—that is, small changes in elevation height or scattering characteristics from one time to another [14,15].
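One standard sample estimator of this coherency, shown here as an illustrative sketch with simulated pixel values rather than a formulation taken from this chapter, is the normalized complex cross-correlation of the two acquisitions over a local window:

```python
import numpy as np

# Sample coherence estimate: gamma = |<s1*conj(s2)>| / sqrt(<|s1|^2><|s2|^2>).
# Values near 1 indicate a stable scene; low values flag temporal decorrelation.
def coherence(s1, s2):
    num = np.abs(np.mean(s1 * np.conj(s2)))
    den = np.sqrt(np.mean(np.abs(s1) ** 2) * np.mean(np.abs(s2) ** 2))
    return num / den

rng = np.random.default_rng(0)
window = 100  # pixels in the estimation window

# Simulated complex pixels: an unchanged patch (same scatterers, constant
# phase offset) versus a patch whose scattering has changed between passes.
pass1 = rng.normal(size=window) + 1j * rng.normal(size=window)
unchanged = pass1 * np.exp(1j * 0.3)
changed = rng.normal(size=window) + 1j * rng.normal(size=window)

print("unchanged patch:", round(coherence(pass1, unchanged), 3))  # ~1.0
print("changed patch:  ", round(coherence(pass1, changed), 3))    # near 0
```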

InSAR is obtained by differencing the phase signatures of at least two SAR images of the same ground area, taken from slightly different angles and/or at different times (similar to stereo imagery). Figure 1.4 shows the most basic configuration, where A₁ and A₂ are the antenna locations, B is the baseline between the two platform positions, h is the sensor height of the first antenna above the geoid, z(y) is the ground elevation above the geoid, θ is the depression angle, α is the baseline orientation angle shown in Figure 1.4, ρ is the slant range from the first antenna to the image point, and ρ + δρ is the slant range from the second antenna to the same image point. Using simple geometry, the following relation is obtained:

z(y) = h − ρ cos θ

Applying the law of cosines and algebraic manipulation yields the expression:

sin(α − θ) = [(ρ + δρ)² − ρ² − B²] / (2ρB)

The measured quantity is the phase difference φ between the two antennas, giving the path difference

δρ = λφ / (2π),

which is ambiguous in range by multiples of the wavelength because φ is known only modulo 2π. The unknown topographic height z(y) is then obtained by combining these expressions:

z(y) = h − ρ cos{α − sin⁻¹[((ρ + δρ)² − ρ² − B²) / (2ρB)]}

It should be noted that InSAR by itself yields only relative height differences, and control points are needed to establish absolute elevations to which the relative heights can be referenced. Phase unwrapping is a process that is typically performed in an attempt to minimize the modulo 2π phase ambiguities.

[Figure 1.4: InSAR platform configuration and geometric relationships. (from [16])]
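The sketch below chains these relations together numerically, assuming the phase has already been unwrapped. Every value is an illustrative assumption rather than a parameter of any actual system:

```python
import numpy as np

# Height from the basic InSAR geometry of Figure 1.4:
#   delta_rho = lambda * phi / (2*pi)
#   sin(alpha - theta) = ((rho + delta_rho)^2 - rho^2 - B^2) / (2*rho*B)
#   z(y) = h - rho * cos(theta)
lam = 0.0566            # wavelength, m
h = 9000.0              # height of antenna A1 above the geoid, m
rho = 12000.0           # slant range from A1, m
B = 100.0               # baseline between A1 and A2, m
alpha = np.radians(45)  # baseline orientation angle (assumed)
phi = 40.0              # unwrapped interferometric phase, rad (assumed)

delta_rho = lam * phi / (2 * np.pi)
s = ((rho + delta_rho) ** 2 - rho**2 - B**2) / (2 * rho * B)
theta = alpha - np.arcsin(s)
z = h - rho * np.cos(theta)

print(f"delta_rho = {delta_rho*100:.2f} cm, "
      f"theta = {np.degrees(theta):.3f} deg, z(y) = {z:.1f} m")
```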

1.3 Rationale for Topic under Investigation

1.3.1 Migration from Research Studies to Operational Mapping

The development of basic SAR/InSAR technologies and the era of experimental research are approaching maturity, as usage migrates from experimental studies to operational missions. Some program managers at the Defense Advanced Research Projects Agency (DARPA) assert that SAR offers substantive improvements over optical methods because SAR's active, coherent signal returns a richer response of amplitude, phase, and polarization under nearly all conditions [17] (whereas optical methods offer only a magnitude response under varying illumination conditions), and they are consequently funding many efforts. In the civilian arena, an interagency ad-hoc working group recently speculated that operational use of SAR will dominate over electro-optical sensors in the future [18]. The migration in funding trends away from experimental research studies is underscored by the casual observation by well-known JPL scientist Dr. Anthony Freeman that funding for research studies has declined substantially in the last several years.¹

Evidence for increased operational usage can be observed in a variety of venues. In the decade of the 90's, "many billions of dollars were spent worldwide on SAR satellites designed for civilian remote sensing. Many hundreds of millions of dollars [have gone] into aircraft SAR facilities, and many millions more into image processing." [19] Increasingly, SAR defense missions sponsored by the Department of Defense (DoD) are being carried out by private industry, as evidenced by a number of recent, published solicitations in the Commerce Business Daily.²

Specific operational applications are increasing in number, as supported by the following examples. In a major operational undertaking, a near-global Digital Elevation Model (DEM) is nearing completion as processed data are released from NASA/NIMA's Shuttle Radar Topography Mission (SRTM). A detailed, continental map of Antarctica is nearing completion using SAR/InSAR data [20]. Finally, InSAR data were recently used to satisfy an operational requirement by the National Oceanic and Atmospheric Administration (NOAA) to generate low- and high-water maps of cloud-covered coastal areas of Alaska [21].

1.3.2 SAR/InSAR Thematic Mapping: Still Stalled at the Experimental Stage

A major area of experimental research study over the past three decades has been the classification of SAR/InSAR data into thematic maps delineating types of terrain, such as forest, cropland, urban areas, etc. Thematic maps can either be an end product, or they can provide a background for planimetric and/or topographic maps. In these more complex products, detailed spatial information (e.g., control points, road route numbers, contour lines, building outlines, etc.) may be superimposed over the thematic map.

In order to keep the scope of the proposed effort manageable, we limit our study to the use of SAR/InSAR data for thematic mapping only. In the scientific literature, literally thousands of published papers describe different types of thematic mapping studies, demonstrating the importance of this topic to the mapping community. These studies employ a wide variety of classification methods and types of thematic maps, using different SAR platforms, sensing methods, wavelengths and polarizations, models, and classification algorithms [22].

Following on from these studies, some at NIMA conjecture that SRTM data (the same data that are being used to generate the near-global DEM) may also be useful as the primary data source for generating reliable, combined planimetric/topographic map series, classified according to feature type.³ Such a series presumes the use of thematic maps generated from SAR/InSAR as a background. Private industry sources such as Vexcel, Inc., bolster NIMA's conjecture with public statements such as, "with higher resolution InSAR data (better than 3 m), it will be possible to automatically generate accurate, high quality, fully symbolic topographic maps using InSAR data sets as the sole input... [A recent Vexcel study] determined that InSAR is capable of producing commercial quality maps at a greatly reduced cost." [23] However, these claims have not been substantiated by independent sources.⁴

Although operational uses for SAR/InSAR are finding their way into specific niches, very real questions remain as to whether these data can, in fact, be used to generate thematic map series automatically and reliably, and if so, to what extent, under what conditions, and to what accuracies. The answers remain elusive because published one-of-a-kind classification studies have examined only narrow aspects of SAR/InSAR processing and applications, mostly on individual scenes. Studies are dominated by single-use classifiers that cannot be repeated on other scenes without extensive (and often labor-intensive) experimentation, tuning of parameters, and often fresh ground truth. In a typical experiment, parameters of a particular classification method are painstakingly tuned to a particular scene in a way that minimizes the classification error in that scene. However, if the same finely-tuned method is applied to another scene with similar features (or even to the same scene at a different time or a slightly different incident angle), the results can be quite different, with a far higher degree of classification error. Thus, further automation is limited by the approach taken, making this expensive and time-consuming mode of classification scarcely practical in a production setting. Based largely on individual scenes and optimization processes that tend to be manually guided, the methodology of "disposable classification" hinders a build-up of knowledge to support further automation.

To date there has not been even a single, overarching study that has examined the linked, complex, and multi-faceted issues that would be involved in using SAR/InSAR data for producing a consistent thematic map series. Indeed, even a basis for conducting such a study has not been determined. We surmise that the lack of a strong theoretical foundation in this area may be the primary reason why such a study has not been undertaken to date. One obvious reason for the limitations of the current practice (or paradigm) when applied to a map series is the fact that the process of SAR/InSAR classification is not based on objective standards. It follows that this practice cannot therefore be used to produce a map series that must, by its very nature, follow a set of consistent standards. We also suggest that the quality of uncertainty statements in SAR/InSAR thematic classification may be deficient, because they do not consider a number of dominant factors that contribute to the error (to be discussed later). By not incorporating these factors, the usefulness of results is limited and improvements in the classification process are hindered.

This situation suggests the need for a so-called paradigm shift: the exploration of a topic from an entirely new angle. Such a shift often suggests new avenues of inquiry, new quantities to measure, and new relationships among variables. To begin our shift towards a novel outlook, we step outside the current paradigm and methodology, and simply treat SAR/InSAR measurements as we would any other scientific measurement.

1.4 The Approach Part One: Standards for Uncertainty in Measurement

A fundamental tenet of science is that a valid experiment must be repeatable and reproducible, to within a probable range of accuracy and precision. It is generally agreed that the usefulness of measurement results is largely determined by the quality of the statements of uncertainty that accompany the measurements [24]. Any experiment that cannot be suitably duplicated is normally discredited as invalid. The following example illustrates this principle:

[In the early 1990’s] researchers Martin Fleischmann and Stanley Pons, then both at the University of Utah, made headlines around the world with their claim to have achieved fusion in a simple tabletop apparatus working at room temperature. Other experimenters failed to replicate their work, however, and most of the scientific community no longer considers cold fusion a real phenomenon. [25]

Basic to this principle is the notion that the quantity to be measured, known as the measurand, is an absolute quantity "known only to God" [26]. It cannot be known by us with total confidence due to deficiencies and errors in the measurement process. To be considered valid, a measurement must be accompanied by a statement of "how close" it is likely to be to the measurand. A short definition of the other terms (shown in italics above) follows:

• Repeatable measurements are those "carried out within a single laboratory by one operator using the same equipment throughout" [27].

• Reproducible measurements are those "obtained with the same method on identical test material but under different conditions (different operators, different apparatus, different laboratories, etc.)" [28].

• Precision is a statistical measure of repeatability, usually given as the variance or the standard deviation of repeated measurements [29].

• Accuracy is the degree to which a measurement is known to approximate a given value [30].

Precision and accuracy are often explained by the use of a "bulls-eye" pattern, as shown in Figure 1.5. The exact center of the bulls-eye represents the measurand, and the scattered dots represent inexact measurements that approximate the measurand.

[Figure 1.5: A depiction of accuracy and precision, with three patterns: precise but inaccurate; accurate but imprecise; accurate and precise.]
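A small synthetic illustration of Figure 1.5 (the sample values are fabricated for concreteness): the bias relative to the measurand captures accuracy, while the spread of repeated measurements captures precision:

```python
import numpy as np

# Two measurement sets aimed at the same measurand (the bulls-eye center).
rng = np.random.default_rng(1)
measurand = 100.0

tight_but_biased = rng.normal(loc=103.0, scale=0.2, size=30)     # precise, inaccurate
centered_but_spread = rng.normal(loc=100.0, scale=2.0, size=30)  # accurate, imprecise

for label, x in [("precise but inaccurate", tight_but_biased),
                 ("accurate but imprecise", centered_but_spread)]:
    bias = x.mean() - measurand          # distance from the bulls-eye center
    spread = x.std(ddof=1)               # dispersion of the shots
    print(f"{label}: bias = {bias:+.2f}, std dev = {spread:.2f}")
```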

It is interesting to note that the terms measurand, repeatability, and reproducibility are rarely (if ever) found in the SAR/InSAR classification literature. In fact, the concepts that a measurand can theoretically be a standard, fixed quantity, and that measurements should ideally be repeatable and reproducible, are not even mentioned! The basic classification and accuracy assessment methods in use today were developed empirically in the late 1970's and early 80's in remote sensing studies, and the same methods were then applied to SAR/InSAR. In general, these methods were intended for use within self-contained, individual scenes, and the classification and accuracy assessment were to be performed independently for each scene. A few notable exceptions to the independent, scene-by-scene classifications were some physics-based models such as JPL's Vegmap [31]; Ulaby's polarimetric models [32]; and vegetation [33] and soil moisture [34] models. However, even these exceptions offered inconclusive results in terms of ambiguous signal responses of some of the classes from scene to scene. The basic methods mentioned above (with some improvements) still remain in wide use today. The classification accuracy continues to be represented by means such as a truth table, which compares sampled areas of the thematic map with ground truth.
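For concreteness, a minimal version of such a truth table (hypothetical classes and sample labels, not data from any cited study) can be tallied as follows:

```python
import numpy as np

# Error matrix ("truth table"): rows = sampled ground truth, columns = map class.
classes = ["forest", "cropland", "urban", "water"]
truth  = np.array([0, 0, 1, 1, 2, 2, 3, 3, 0, 1])   # ground-truth labels
mapped = np.array([0, 1, 1, 1, 2, 0, 3, 3, 0, 1])   # thematic-map labels

n = len(classes)
matrix = np.zeros((n, n), dtype=int)
for t, m in zip(truth, mapped):
    matrix[t, m] += 1

print(matrix)
# Overall accuracy: correctly classified samples (the diagonal) over all samples.
print(f"overall accuracy = {np.trace(matrix) / matrix.sum():.0%}")
```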

The current situation begs the question: in overlooking key principles for scientific measurements in general, are thematic classifications as they are practiced today subject to the same criticism as the cold fusion experiment? Experiments to date on SAR/InSAR data have not examined this issue, nor (as far as we know) have experiments been constructed to study the repeatability or reproducibility of measurements. Our studies to date suggest that the fundamental principles of measurand, repeatability, and reproducibility are the key issues in determining the ability to ramp up experimental studies to mass production of thematic map products from SAR/InSAR data.

In light of these perspectives, this effort proposes to develop a theoretical basis and general methodology to enable reliable and consistent production of thematic map series from SAR/InSAR data, by applying the concepts of measurand, repeatability, and (of special interest) reproducibility. If such a theory and methodology could be successfully developed, they could potentially save millions of dollars in misdirected studies and projects, and serve to focus efforts in classification with a greater likelihood of achieving repeatable and reproducible results that are both accurate and precise to within acceptable limits.

1.4.1 Development of New Standards

In the time since the basic remote sensing methods were developed and entered standard practice, international efforts have been underway throughout the 1990's to standardize the methods by which any type of measurement uncertainty is presented. These efforts were presented to the international community in the Guide to the Expression of Uncertainty in Measurement (GUM) [35]. The Guide was published in 1993 (updated in 1995) by the International Committee for Weights and Measures (CIPM), with involvement and endorsement from several international bodies: the International Bureau of Weights and Measures (BIPM), the International Electrotechnical Commission (IEC), the International Organization for Standardization (ISO), and the International Organization of Legal Metrology (OIML). The movement toward an international standard has been driven largely by a global economy and marketplace, and its adoption is intended to allow measurements made in different countries in areas as diverse as science, engineering, commerce, industry, and regulation to be more easily understood, interpreted, and compared.

The GUM approach was adopted by the U.S. National Institute of Standards and Technology (NIST) and re-presented in succinct form in 1994 as NIST Technical Note 1297, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results [36]. NIST has put forth these guidelines not only for NIST measurement results, but also for measurement results associated with basic research, applied research and engineering, calibration and certification, and other uses. Our research approach proposes to apply GUM/NIST guidelines on repeatability and reproducibility to the thematic classification of SAR/InSAR data.

This study will be augmented by an analysis of several recent and pending geospatial data standards. The geospatial standards will be briefly described, then examined and critiqued in light of GUM/NIST concepts.

1.4.2 Standard Terminology

Although a number of terms are presented in the guidelines, three terms are important for our purposes: measurand, repeatability, and reproducibility. Since these concepts are critical to our discussion, their meaning and usage are expanded below.

The measurand is the value of the specific quantity subject to measurement. The result of a measurement is only an approximation or estimate, which is complete only when accompanied by a quantitative statement of its uncertainty. The estimated value of the measurand is defined by a standard method of measurement. In SAR/InSAR scene classification, an important measurand quantity is the actual class as defined, on or near the earth's surface (such as white pine forest). The Technical Note recommends that all parameters upon which the measurand depends be varied to the fullest extent practical, so that the evaluations are based as much as possible on observed data. Whenever feasible, the use of empirical models of the measurement process (founded on long-term quantitative data, and the use of check standards and control charts that can indicate whether a measurement process is under statistical control) should be part of the effort to obtain reliable evaluations of components of uncertainty [37]. In general, the error of a measurement is unknown because the value of the measurand is unknown. However, the uncertainty of the result of a measurement may be evaluated.

The repeatability (of results of measurements) is the closeness of the agreement between the results of successive measurements of the same measurand, carried out under the same conditions of measurement. The repeatability conditions include:

• The same measurement procedure

• The same observer

• The same measuring instrument, used under the same conditions

• The same location

• Repetition over a short period of time.

Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results. In the case of SAR/InSAR image classification, this implies, in general, repeated use of the same sensor, procedures, processing equipment, operator, and laboratory to process an image of the same terrain (i.e., scene) several times, with as short an interval as possible between the times of performing the image sensing and classification. Each image is processed in a similar fashion, then the results are compared and averaged. The essence of repeatability is determining the extent of variability in the imaging, processing, and classification activities by repeating the same process a number of times. In this context, repeatability is similar to measuring the length of a ruler a number of times, wherein each time the measurement will vary slightly (within the precision of the measuring instrument). By holding all factors as constant as possible and repeating the same process, the consistency of the process itself and its influence on the results can be isolated. The final result of a measurement (depending on the type of measurement sought) may be, for example, the mean and standard uncertainty (e.g., the estimated standard deviation, equal to the positive square root of the estimated variance).
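A minimal sketch of that reduction (the repeated values are invented for illustration): n repeated results collapse to a mean plus a standard uncertainty:

```python
import numpy as np

# n repeated measurements of the same scene under repeatability conditions.
repeats = np.array([41.2, 40.9, 41.4, 41.1, 40.8, 41.3])  # illustrative results
n = len(repeats)

mean = repeats.mean()
s = repeats.std(ddof=1)      # estimated standard deviation (sqrt of est. variance)
u = s / np.sqrt(n)           # standard uncertainty of the reported mean

print(f"result = {mean:.2f} +/- {u:.2f}  (repeatability std dev = {s:.2f}, n = {n})")
```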

The reproducibility of the results of measurements is the closeness of the agreement between the results of measurements of the same measurand carried out under changed conditions of measurement. A valid statement of reproducibility requires specification of the conditions changed, which may include:

• Principle of measurement

• Method of measurement

• Observer

• Measuring instrument

• Reference standard

• Location

• Conditions of use

• Time.

Reproducibility may be expressed quantitatively in terms of the dispersion characteristics of the results. In terms of SAR/InSAR classification, reproducibility means several different labs process the same dataset to generate the same categories of features, although different parts of the process may vary. Reproducibility is a useful measure, as it assesses the range of uncertainty when various factors in the process are changed.

1.4.3 Reporting Requirements

Standard reporting methods are prescribed in the guidelines. The report should include the standard uncertainty u_i; either the combined standard uncertainty u_c or the expanded uncertainty U (along with the coverage factor k and how its value was chosen); and a list of all components of standard uncertainty, along with their degrees of freedom (where appropriate). For each component of standard uncertainty, the method used to estimate its numerical value should be stated, as well as a detailed description of how each component was evaluated. It is often desirable to provide a probability interpretation, such as a level of confidence, for the interval defined by U or u_c, along with the basis for the statement.

Finally, the uncertainty depends not only on the repeatability and reproducibility of the measurement results, but also on how well one believes the standard measurement method has been implemented. The particular method of measurement should be clearly indicated and described when necessary.
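As an illustration of this reporting scheme (component names, magnitudes, and the coverage factor below are assumptions made for the sake of the example), independent standard uncertainty components combine in quadrature, and the expanded uncertainty scales the result by k:

```python
import numpy as np

# Combine independent (uncorrelated) standard uncertainty components u_i in
# quadrature to get u_c, then report the expanded uncertainty U = k * u_c.
components = {                      # each u_i in the units of the final result
    "sensor/receiver noise": 0.8,
    "processing and calibration": 0.5,
    "classification model": 1.2,
}

u_c = np.sqrt(sum(u**2 for u in components.values()))
k = 2.0                             # coverage factor (roughly a 95% level)
U = k * u_c

for name, u in components.items():
    print(f"  u_i ({name}) = {u}")
print(f"u_c = {u_c:.2f}, U = k*u_c = {U:.2f} with k = {k}")
```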

A useful model for applying the general guidelines of GUM to a specific discipline can be found in the Guide to Quantifying Uncertainty in Analytical Measurement [38]. Even though the document addresses a different discipline (chemistry), the underlying problems and measurement concepts are quite similar. This particular guide is offered for use in analyzing the results of quantitative chemical analysis. The foreword makes the following statements, which can be echoed in the problems seen in thematic classifications of SAR/InSAR:

...Whenever decisions are based on analytical results, it is important to have some indication of the quality of the results, that is, the extent to which they can be relied on for the purpose at hand. Users of the results of chemical analysis, particularly in those areas concerned with international trade, are coming under increasing pressure to eliminate the replication of effort frequently expended in obtaining them. Confidence in data obtained outside the user's own organization is a prerequisite to meeting this objective... In analytical chemistry, there has been great emphasis on precision of results obtained using a specific method, rather than on their traceability to a defined standard... This EURACHEM document shows how the concepts in the ISO Guide may be applied in chemical measurement.

The proposed effort will follow an approach similar to the chemical guide by expanding on and applying the following topics to the area of SAR/InSAR thematic classification:

• Scope and field of application

• Uncertainty

• Analytical Measurement and Uncertainty

• The Process of Measurement Uncertainty Estimation

Step One: Specification

Step Two: Identifying Uncertainty Sources

Step Three: Quantifying Uncertainty

Step Four: Calculating the Combined Uncertainty

• Reporting Uncertainty

In this way, a methodology will be prescribed to establish the total confidence of measurement results, traceable to established standards (discussed below) and/or accepted empirical methods. All of the sources of uncertainty will be considered for the end-to-end operations described in the following section, to the extent possible. However, most effort will focus on those factors that contribute the greatest uncertainty, since together they present a good estimate of the total uncertainty.

1.5 The Approach Part Two: Applying GUM to the Chain of SAR/InSAR Processes

Central to performing the above analysis is understanding the chain of complex operations involved in transmitting and receiving a SAR signal, processing the received data to form an image, and processing the image to generate a thematic map that identifies classes or feature types. Each of these operations contributes sources of error, some of which have been overlooked in the traditional error assessment process. These basic processes are depicted in Figure 1.6. The SAR antenna sends out a series of pulses that interact with the landscape to form the returned raw signal data. These data are processed through a series of detailed operations and output as an image and corresponding mathematical data. These data in turn provide the input for image classification models and algorithms. Finally, digital and/or hardcopy maps are generated, which include topography and feature classes such as forest, grassland, cropland, urban areas, and water.

[Figure 1.6: Four basic processes in generating landscape maps from SAR: (1) terrain/landscape interaction with the signal, producing raw signal data; (2) antenna and processor, producing image data; (3) image classification models/algorithms; and (4) digital/hardcopy maps of topography and landscape classes.]

First, the terrain and landscape characteristics (part 1) and how they interact with

the SAR signal will be studied in light of GUM guidelines, emphasizing the physics of

the scene and signal and their interactions. This is particularly important, since all of the processing steps that follow (up to the final classified map) rely on the raw signal as it is returned from the scene. The essential components to be analyzed using GUM are: electromagnetic plane wave theory [39,40]; the influence of landscape characteristics such as texture [41], material composition [42], and moisture content [43] on the amplitude, phase, and polarization response; the influence of parameters in the radar equation [44]; the effects of varying the sensor incidence angle, wavelength, flying height, polarization mode, and number of looks [45]; the effect of atmospheric conditions

[46]; and the effect of environmental phenology (change with daily/seasonal influences)

[47,48]. The primary variables that cannot be fixed or known in advance will be identified, and their range of expected values will be explored. Knowledge of this range for unknown variables will allow us to project reasonable expectations of classification repeatability and reproducibility for similar features across a variety of scenes. The primary outcome of this part is expected to be a strong

understanding of the limits of repeatability and reproducibility based on the physics of

the scene, the signal, and the interaction with particular types of landscape characteristics.
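For orientation, the sketch below evaluates a simplified monostatic form of the radar equation referenced in [44] (all parameter values are hypothetical and chosen only for illustration; real SAR systems involve additional gain and loss terms):

```python
import math

def received_power(p_t, gain, wavelength, sigma, r):
    """Simplified monostatic radar equation for a point target, no losses:
    P_r = P_t * G**2 * lambda**2 * sigma / ((4*pi)**3 * R**4)
    """
    return p_t * gain**2 * wavelength**2 * sigma / ((4 * math.pi) ** 3 * r**4)

# Hypothetical C-band case: 5.6 cm wavelength, 800 km slant range.
p_r = received_power(p_t=2e3, gain=10**3.5, wavelength=0.056, sigma=1.0, r=8e5)
print(f"received power ~ {p_r:.3e} W")
```

The strong R**4 dependence is one reason sensor geometry (incidence angle, flying height) figures so prominently among the variables listed above.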

Next the antenna and processor (part 2) will be studied as they pertain to

converting the one-dimensional analog input signal to 2-D and 3-D digital output

products. While some of the primary parameters of a given antenna and processor

configuration are well established, others are not and vary from system to system. Figure

1.7 is a flowchart of the primary internal functions involved in producing the output data,

including calibration. The uncertainty associated with these primary functions will be

studied as well as their cumulative effect and the potential variance from system to

system.

Image Classification Models and Algorithms (part 3) will be studied next, again

through the methodology of GUM. Some of the models in usage follow established

remote sensing classification techniques [50,51,52], others develop and apply forward or

inverse electromagnetic scattering models [53,54,55], while yet others apply machine

learning and/or statistical classification techniques [56,57,58]. Examples of the types of

supervised and unsupervised classifiers in use include Bayes Maximum Likelihood,

probability density functions (PDFs), Markov Random Fields (MRF), and neural and fuzzy classifiers. Examples of statistical and mathematical algorithms in use include the Fast

Fourier Transform, fractal analysis, local statistics, K-nearest neighbor, and Gabor wavelets. Knowledge-based classification approaches will also be explored, which use hierarchical and rules-based methods [also 56,57,58]. The fundamental principles of classification [59,60,61] and the means for assessing the accuracy of a classified map

[62] will also be studied.
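As a concrete illustration of one of the classifier families named above, the following sketch applies a Gaussian maximum-likelihood decision rule to pixel feature vectors (a toy example with synthetic two-band data; it is not an implementation drawn from the cited literature):

```python
import numpy as np

def train_gaussian_ml(samples_by_class):
    """Estimate a mean vector and covariance matrix per class from training pixels."""
    return {label: (x.mean(axis=0), np.cov(x, rowvar=False))
            for label, x in samples_by_class.items()}  # x: (n_samples, n_features)

def classify_pixel(pixel, stats):
    """Assign the class with the highest Gaussian log-likelihood."""
    best_label, best_ll = None, -np.inf
    for label, (mu, cov) in stats.items():
        d = pixel - mu
        ll = -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))
        if ll > best_ll:
            best_label, best_ll = label, ll
    return best_label

# Synthetic two-band training data for two cover classes.
rng = np.random.default_rng(0)
train = {"forest": rng.normal([0.20, 0.60], 0.05, (50, 2)),
         "water":  rng.normal([0.05, 0.10], 0.02, (50, 2))}
stats = train_gaussian_ml(train)
print(classify_pixel(np.array([0.18, 0.55]), stats))  # expect "forest"
```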

[Figure 1.7 (flowchart, from [49]): raw signal data pass through telemetry decoding, range-reference derivation, autofocus/clutterlock estimation, histogram and caltone estimation with QA analysis, azimuth-reference generation, compression and corrections, multi-look selection, geometric correction/geocoding, Stokes-matrix and calibration operations, and image formatting to produce the output products.]

Figure 1.7. Flowchart of SAR processor functions (from [49]).

The main purpose of this part of the study is not to understand the fine details by which particular algorithms are implemented, but rather to gain an understanding of how to apply GUM to this aspect of processing.

Standard map output requirements (part 4) will also be studied. It is interesting to note that none of the published studies, as seen in the literature, reference a standard when performing classification on a given set of SAR/InSAR data. Each study establishes a set of unique classes, which tend to be based simply on the predominance of certain features in the image and the operator’s somewhat arbitrary assessment of what constitutes a given class. Yet classification and representation standards should provide an essential framework for producing map series that are consistent, reliable, and comply with established accuracy statements. For this part of the effort, two dominant standards will be studied. These standards (to be studied in an overview fashion) include the

Vegetation Classification Standard [63] and the Federal Geographic Data Committee’s emerging standards on the National Spatial Data Infrastructure [64], as they relate to thematic mapping. These standards are selected because they are expected to have a wide impact on the user community, and are generally representative of other thematic map standards. As with the first three parts of the process, GUM guidelines on repeatability and reproducibility will be applied to the study of standard map output requirements.

The GUM guidelines also prescribe standard methods for handling dependent variables, such as when one part of a process interacts with another. The final combined assessment of accuracy will follow the expanded treatment prescribed by GUM.
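For reference, the combined-uncertainty expression alluded to here, including the covariance term that GUM prescribes for correlated (dependent) input quantities, has the general form (the standard GUM formulation, reproduced for convenience rather than derived in this study):

```latex
u_c^2(y) = \sum_{i=1}^{N} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i)
  + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N}
    \frac{\partial f}{\partial x_i}\,\frac{\partial f}{\partial x_j}\, u(x_i, x_j)
```

where y = f(x_1, ..., x_N) is the measurand and u(x_i, x_j) is the estimated covariance between input quantities x_i and x_j.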

1.6 Expected and Potential Outcomes

From this work, the primary expected outcome will be the development of a preliminary methodology that applies the principles of GUM to the thematic

classification of SAR/InSAR data. The methodology will aid in better determining the quality of the final result and its fitness for the purpose for which it is used, by associating a fuller measure of confidence with the results.

It is hoped that the results, when published, will catalyze efforts by a recognized

body to develop a more formal guide. Such results, in turn, could significantly advance

SAR/InSAR classification from the realm of expensive and time-intensive, one-of-a-kind

studies toward the cost-effective and efficient production of thematic map series. To repeat an earlier statement, the research may also “suggest new avenues of inquiry; new quantities to measure; and new relationships among variables.”

The developed theory and methodology may also be useful to other areas of mapping science and applications for arriving at more rigorous standards and practices.

Such areas might include, for example, other types of remote sensing practices and map classification standards for both raster and vector map series.

1.7 Notes and References

Notes

1. Freeman, Anthony. E-mail exchanges and telephone conversations in February 1999 regarding his research at the Jet Propulsion Laboratory.

2. A keyword search of “Synthetic Aperture Radar” in the Commerce Business Daily (http://cbd.dos.com/cgi-bin/cbd.cgi) for the years 1995-2001 listed 27 different SAR platforms and missions sponsored by the U.S. Department of Defense, such as: P/3 SAR, Eagle Vision SAR, FOPEN, HPACE, TUAV, MARS, JOINT STARS, ACTD, TESAR-ATR, EBRA, IADSS, and AVIS. A clear trend away from concept and demonstration projects toward operational missions can be observed over this time.

3. Berg, Richard (NIMA program director), in a 10/30/99 e-mail from [email protected] to [email protected] in feedback comments to the proposal, wrote, “We [NIMA] would like to see someone address issues more related to obtaining reliable feature classifications with a combination of SIR-C and X-SAR SRTM data.” (e-mail copied to [email protected] on 11/1/99)

4. A number of e-mails and phone calls were exchanged in fall 1998, and again in February and March 1999, to and from Richard Carande at Vexcel Corporation and Yunjin and Diane Evans at JPL. The full report, which provided technical details behind the statements made in [23], was referred to as the “Dom Giglio DARPA contract” to “compare map making using optical photography with IFSAR,” done in 1994 by Rob Ledner at Vexcel. Ledner and others at Vexcel, JPL, and DARPA were unable to locate the report.

References

[1] Henderson, Floyd and Lewis, Anthony, Editors. “Introduction,” Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2, New York: John Wiley & Sons, Inc., 1998, pp. 2-5.

[2] Raney, Keith, “Radar Fundamentals: Technical Perspective,” Principles and Applications of Imaging Radar. Manual of Remote Sensing, Third Edition. Volume 2, New York: John Wiley & Sons, Inc., 1998, pp. 82-83.

[3] Curlander, John and McDonough, Robert, Synthetic Aperture Radar: Systems and Signal Processing, New York: John Wiley & Sons, Inc., 1991, pp. 28-29.

[4] Ibid., pp. 33-44.

[5] Toomay, J.C., Radar Principles for the Non-Specialist: Second Edition. Mendham, New Jersey: SciTech Publishing, Inc., 1998, pp. 146-154.

[6] Madsen, Soren and Zebker, Howard, “Imaging Radar Interferometry,” ------, Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. New York: John Wiley & Sons, Inc., 1998, pp. 359-360.

[7] Waite, W.P., “Historical Development of Imaging Radar, Geoscience Applications of Imaging Radar Systems,” RSEMS (Remote Sensing of the Electro Magnetic Spectrum, A.J. Lewis, Ed.), Association of American Geographers, 3(3): 1-22, 1976.

[8] Ulaby, F.T. and Elachi, C., Editors, Radar Polarimetry for Geoscience Applications. Norwood, MA: Artech House, Inc., 1990.

[9] Raney, Keith, “Radar Fundamentals: Technical Perspective,” Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2, New York: John Wiley & Sons, Inc., 1998, p. 38.

[10] Ibid., p. 108.

[11] Treuhaft, Robert N. and Siqueira, Paul R., “Vertical Structure of Vegetated Land Structures from Interferometric and Polarimetric Radar,” Radio Science, Volume 35, Number 1, January-February 2000, pp. 141-177.

[12] Gamba, P., Houshmand, B., and Saccani, M., “Detection and Extraction of Buildings from Interferometric SAR Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 1, January 2000, pp. 611-775.

[13] Strozzi, T., Dammert, P., Wegmuller, U., et al., “Landuse Mapping with ERS SAR Interferometry,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 2, March 2000, pp. 766-775.

[14] Zebker, H.A. and Villasenor, J., “Decorrelation in Interferometric Radar Echoes,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 5, September 1992, pp. 950-959.

[15] Franceschetti, G. and Iodice, A., “The Effect of Surface Scattering on IFSAR Baseline Decorrelation,” Journal of Electromagnetic Waves and Applications, Vol. 11, 1997, pp. 353-370.

[16] From RSMAS Technical Report TR95-003. SAR Interferometry and Surface Change Detection. Report of a workshop held in Boulder, Colorado, February 3-4, 1994. Published July 1995 by the University of Miami Rosenstiel School of Marine and Atmospheric Science, pp. 3-8.

[17] Fulghum, David A., “DARPA Looks Anew at Hidden Targets,” Aviation Week & Space Technology, January 6, 1997, pp. 56-57.

[18] JPL Publication 96-16. Operational Use of Civil Space-Based Synthetic Aperture Radar, prepared by the interagency ad hoc working group on SAR, August 21, 1996.

[19] Raney, Keith, “Radar Fundamentals: Technical Perspective,” Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2, New York: John Wiley & Sons, Inc., 1998, p. 9.

[20] Byrd Polar Research Center, URL http://www-bprc.mps.ohio-state.edu. “Radarsat-1 Antarctic Mapping Project,” accessed 4-13-01 (references activities up to the year 2000).

[21] Tuell, Grady H., “The Use of High Resolution Airborne SAR for Shoreline Mapping,” from Object Recognition and Scene Classification from Multispectral and Multisensor Pixels, addendum to the Proceedings of the ISPRS Commission III Symposium, held July 6-10, 1998, in Columbus, Ohio.

[22] Stevenson, Paula J., Literature Review on SAR for Mapping and Feature Extraction (Part I), an internal report indexing over 500 relevant books, articles and papers on SAR/InSAR, Center for Mapping at The Ohio State University, Columbus, Ohio, August 1998 (updated June 2001), 49 pages.

[23] Vexcel Corporation, URL http://www.vexcel.com/proj/ifsar/html, accessed 10/22/98.

[24] Taylor, Barry N. and Kuyatt, Chris E., NIST Technical Note 1297. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results. Physics Laboratory of the National Institute of Standards and Technology, Gaithersburg, Maryland, September 1994, Foreword.

[25] URL http://www.scientificamerican.com/askexpert/physics/physics6.html.

[26] Diamond, William J., Practical Experiment Designs for Engineers and Scientists. Belmont, CA: Lifetime Learning Publications, 1981, p. 8.

[27] Mallows, Colin C., Design, Data, and Analysis. New York, New York: John Wiley & Sons, 1987, p. 73.

[28] Ibid., p. 75.

[29] Robinson, A., Sale, R., Morrison, J., and Muehrcke, P., Elements of Cartography. Fifth Edition. New York, New York: John Wiley & Sons, 1984, p. 524.

[30] Ibid., p. 516.

[31] Freeman, A., Chapman, B., and Alves, M., MAPVEG Software User’s Guide (JPL Document D-11254). Pasadena, California: Jet Propulsion Laboratory, October 1993.

[32] ------, Radar Polarimetry for Geoscience Applications. Norwood, MA: Artech House, Inc., 1990.

[33] Brisco, Brian and Brown, Ronald, “Agricultural Applications with Radar,” Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. New York: John Wiley & Sons, Inc., 1998, pp. 381-406.

[34] Dobson, M.C. and Ulaby, F.T., “Mapping Soil Moisture Distribution with Imaging Radar,” ------, Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. New York: John Wiley & Sons, Inc., 1998, pp. 407-433.

[35] Guide to the Expression of Uncertainty in Measurement: First Edition 1995. International Organization for Standardization, 1995, printed in Switzerland.

[36] ------, NIST Technical Note 1297. Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results. Physics Laboratory of the National Institute of Standards and Technology, Gaithersburg, Maryland, September 1994.

[37] Croarkin, C., Measurement Assurance Programs. Part II: Development and Implementation. NBS Special Publication 676-II, Washington, D.C.: U.S. Government Printing Office, 1985.

[38] Guide to Quantifying Uncertainty in Analytical Measurement. Second Edition. Eurachem, St. Gallen: EMPA, c. 2000/2001, URL http://www.measurementuncertainty.org/mu/guide/index.html.

[39] Sears, F.W., Zemansky, M.W., and Young, Hugh, University Physics. Sixth Edition. Reading, MA: Addison-Wesley Publishing Company, 1982.

[40] Lorrain, P., Corson, D., and Lorrain, F., Electromagnetic Fields and Waves: Third Edition. New York: W.H. Freeman and Company, 1988.

[41] Schistad-Solberg, A.H. and Jain, A.K., “Texture Fusion and Feature Selection Applied to SAR Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 2, March 1997, pp. 475-479.

[42] van Zyl, J., Zebker, H., and Elachi, C., “Imaging Radar Polarization Signatures: Theory and Observation,” Radio Science, Volume 22, Number 4, July-August 1987, pp. 529-543.

[43] El-Rayes, M.A. and Ulaby, F.T., Microwave Dielectric Behavior of Vegetation Material. Report No. 022132-4-T. Radiation Laboratory of the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, January 1987 (Contract NAG 5-480 from NASA/Goddard Space Flight Center, Greenbelt, Maryland).

[44] ------, Synthetic Aperture Radar: Systems and Signal Processing. New York: John Wiley & Sons, Inc., 1991.

[45] ------, Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. New York: John Wiley & Sons, Inc., 1998.

[46] Zebker, H.A., Rosen, P.A., and Hensley, S., “Atmospheric Effects in Interferometric Synthetic Aperture Radar Surface Deformation and Topographic Maps,” Journal of Geophysical Research, Volume 102, Issue B4, April 10, 1997, pp. 7547-7563.

[47] Jensen, John R., Introductory Digital Image Processing: A Remote Sensing Perspective. Second Edition. Upper Saddle River, New Jersey: Prentice Hall, 1996.

[48] Polidori, L., Caillault, S., and Canaud, J.-L., “Change Detection in Radar Images: Methods and Operational Constraints,” Proceedings IGARSS ’95, Firenze, Italy, July 1995, pp. 1529-1531.

[49] ------, Synthetic Aperture Radar: Systems and Signal Processing, New York: John Wiley & Sons, Inc., 1991, p. 356.

[50] Schott, John R., Remote Sensing: The Image Chain Approach. New York: Oxford University Press, 1997.

[51] Schowengerdt, Robert A., Remote Sensing Models and Methods for Image Processing: Second Edition. San Diego: Academic Press, 1997.

[52] Schanda, Erwin, Physical Fundamentals of Remote Sensing. Berlin, Germany: Springer-Verlag, 1986.

[53] Hopcraft, K.I. and Smith, P.R., An Introduction to Electromagnetic Inverse Scattering. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1992.

[54] Ulaby, F.T., Moore, R.K., and Fung, A.K., Microwave Remote Sensing: Active and Passive. Volume II: Radar Remote Sensing and Surface Scattering and Emission Theory. Reading, MA: Addison-Wesley Publishing Company, 1982.

[55] Golden, Borup, Cheney, et al., “Inverse Electromagnetic Scattering Models for Sea Ice,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 5, September 1998, pp. 1675-1704.

[56] Mitchell, Tom M., Machine Learning. New York: McGraw-Hill Companies, Inc., 1997.

[57] Briscoe, Garry and Caelli, Terry, A Compendium of Machine Learning. Volume 1: Symbolic Machine Learning. Norwood, New Jersey: Ablex Publishing Company, 1996.

[58] Rich, Elaine and Knight, Kevin, Artificial Intelligence. Second Edition. New York: McGraw-Hill, Inc., 1991.

[59] Marr, David, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W.H. Freeman and Company, 1982.

[60] Booch, Grady, Object-Oriented Analysis and Design with Applications: Second Edition. Reading, Massachusetts: Addison-Wesley, 1994.

[61] Green, David M. and Swets, John A., Signal Detection Theory and Psychophysics. New York: John Wiley & Sons, Inc., 1966.

[62] Richards, John A., Remote Sensing Digital Image Analysis: An Introduction, Berlin: Springer-Verlag, 1986, pp. 231-235.

[63] Vegetation Subcommittee, Federal Geographic Data Committee, FGDC-STD-005: Vegetation Classification Standard, June 1997.

[64] URL http://www.fgdc.gov/nsdi/nsdi.html.

CHAPTER 2

THE SCIENTIFIC METHOD

Since we are assured that the all-wise Creator has observed the most exact proportions of number, weight and measure in the make of all things, the most likely way therefore to get any insight into the nature of those parts of the Creation which come within our observation must in all reason be to number, weigh and measure. [1] —Stephen Hales, 1727 A.D.

In this chapter we briefly depart from the proposed methodology in order to

examine the more basic concepts of the scientific method, paradigms, normal science,

theories, and scientific experimentation. Figure 2.1 depicts the relationship

among these topics, in which the scientific method guides the design of paradigms,

theories, and experiments. An underlying foundation is that those who practice the

scientific method are expected to do so with integrity, and therefore some discussion is

given to this topic as well. As will be seen, these concepts offer a further foundation for

a critique of the current method. These concepts will then be compared and contrasted with current practices for classifying SAR/InSAR data.

Certain precepts define the scientific method, which can be described as reproducibility, explanatory power, and falsifiability [2]. Reproducibility (in a more general sense than presented in GUM) describes the need for measurements and observations to be described meticulously. In principle, anyone with the proper training should obtain the same results from the same procedure. Non-repeatable measurements

are usually either ignored or repudiated. Explanatory power means that the explanation is not completely ad hoc, that is, it can explain beyond the particular situation that is being studied. In contrast to explanatory power, an ad hoc measurement is made only for the specific purpose, case, or situation at hand and for no other. Finally, falsifiability means that a theory must make specific predictions that have the potential to be proven wrong. It is easy to obtain confirmations, or verifications, for nearly every theory—if we look for confirmations. Every good scientific theory is a prohibition: it forbids certain things to happen. The more a theory forbids, the better it is.

[Figure 2.1 (diagram): The Scientific Method (reproducibility, explanatory power, falsifiability) frames the Paradigm (a school of thought that provides theory, methods, and standards), which constrains Normal Science; a Theory is a testable hypothesis about an aspect of normal science, examined through the Scientific Experiment (project strategy, experimental strategy, hypothesis based).]

Figure 2.1. The scientific method, through paradigms that direct normal science, offers a framework for the development of theories and scientific experiments. (The results of such experiments must be reported with scientific integrity.)

All real tests are attempted refutations [3], that is, attempts to prove that a theory is in error. For example, Einstein’s theory of gravitation clearly satisfied the criterion of falsifiability because there was a clear chance to refute the theory. However, astrology did not pass the test, and hence can be characterized as non-scientific, or a pseudo-science. Astrologers were impressed and misled by what they believed to be confirming evidence—enough so to ignore unfavorable evidence. By making their predictions and interpretations vague, they could explain away anything that might have been considered a refutation. In order to escape falsification they destroyed the testability of their theory.

(“It is a typical soothsayer’s trick to predict things so vaguely that the predictions can hardly fail.”) [4]

2.1 Paradigms, Normal Science, and Theories

Scientific research (and the corresponding measurements) is guided by

“paradigms, or schools of thought, which suggest important questions, define appropriate methodological approaches to answering these questions, and determine answers to these questions.”[5] A new paradigm is a shift of thought or perspective—instead of seeing

only the outside of a box from above, it sees the inside of the box from below. Paradigms are the framework by which a particular scientific practice is conducted. They can also be described as:

“...past scientific achievements that some scientific community acknowledges for a time as supplying the foundation for its further practice. A very important aspect of paradigms is that they cannot be disproved or rejected on the basis of scientific evidence. They can only be rejected in favor of another paradigm that promises to serve as a better, more encompassing, foundation for research. It is important to recognize that paradigms are not to be equated with scientific theories. Theories make predictions that can be subjected to scientific tests. Paradigms provide a framework in which to conduct and interpret research. They can be judged (though perhaps in hindsight) by how well they serve this purpose, not by whether they are right or wrong. A paradigm suggests which experiments are worth performing and which are not.” [6]

A paradigm is important, because without one all facts that might relate to the development of a given science can seem equally relevant. Thomas Kuhn, in The Structure of Scientific Revolutions [7], suggests that normal science follows an existing paradigm. In fact, most scientific activities can be considered as the practice of normal science, described as:

“Perhaps the most striking feature of the normal research problems is how little they aim to produce major novelties, conceptual or phenomenal. One of the things a scientific community acquires with a paradigm is a criterion for choosing problems that, when the paradigm can be taken for granted, can be assumed to have solutions. To a great extent these are the only problems that the community will admit as scientific or encourage its members to undertake...

“In learning a paradigm the scientist acquires theory, methods, and standards together, usually in an inextricable manner. Therefore, when paradigms change, there are usually significant shifts in the criteria determining the legitimacy both of problems and of proposed solutions.” [8]

Normal science is characterized by one or more of the following cases [8]:

1. The group of facts elevated by the paradigm is used to solve problems, in

order to achieve more precision and extend it to a greater variety of

conditions.

2. The existence of a given paradigm determines the problem to be solved, and

often the design of mechanisms to solve the problem.

3. Empirical work is conducted, in order to further explain the paradigm theory

by resolving some of its ambiguities and solving residual problems.

However, a change in paradigm does not necessarily imply a large shift in perspective, or even affect those outside a single community comprised, perhaps, of fewer than 25 people [9].

In contrast with a paradigm, a theory always remains a hypothesis, unless or until it is disproven. Theories are put forth tentatively and tested; if the outcome shows a theory is wrong, it is eliminated. The method of trial and error is essentially a method of elimination. [9] A theory can never be proven true in all cases, since we do not have an infinity of time, situations, or equipment in which to explore all possible cases. All we can do is “grope for truth even though it is beyond our reach...that truth is beyond human authority...for without this idea there can be no objective standards of inquiry, no criticism of our conjecture; no groping for the unknown; no quest for knowledge.” [10]

In Conjectures and Refutations: The Growth of Scientific Knowledge, Dr. Karl Popper sets forth three requirements for a good theory [11]:

1. The new theory should “proceed from some simple, new, and powerful,

unifying idea about some connection or relation (such as gravitational

attraction) between hitherto unconnected things (such as planets and comets)

or facts (such as inertial and gravitational mass) or new ‘theoretical entities’

(such as field or particle).”

2. The new theory should be independently testable. It must “have new and

testable consequences (preferably consequences of a new kind); it must lead

to the prediction of phenomena which have not so far been observed.”

3. It should “pass some new, and severe, tests. This can only be done by testing

the theory empirically. A theory may be ad hoc if it is not independently

testable by experiments of a new kind. A theory which is not refutable by any

conceivable event is non-scientific. Every genuine test of a theory is an

attempt to falsify it, or refute it.”

2.2 Designing a Scientific Experiment

Next we examine the basic principles for conducting a scientific experiment, as a framework for critiquing current SAR/InSAR practices. The content of this section is derived from the book by William Diamond, Practical Experiment Designs for Engineers and Scientists [12]. A good experimental design relies on an understanding of statistics.

First, a total project strategy is established, then a set of experiments is devised that fits within that strategy.

A good project strategy consists of two basic parts: (1) defining the objective of the project as specifically as possible; and (2) defining the total experimental space in which a satisfactory product or process is expected to be found, including: (a) specifying all of the variables of the product or process that could influence the quality or performance of the end product or process, (b) defining the reasonable range of interest for each specified variable, and (c) defining all the responses of interest.

In a good experimental strategy, it is important to establish early whether a product or process with the specified properties is feasible in the defined variable space.

The correct strategy for a specific experiment can only be determined if its objective is stated correctly and completely. In advance of performing the experiment, it should be decided how much information is needed from that experiment, and what confidence level the final results need to achieve. When these facts and the number of variables to be included in the experiment are determined, “the correct experimental design is virtually automatic” [13]. Before experimentation begins, the test methods and test equipment to be used must be specified, and the validity and precision of the test methods must be established.

As mentioned earlier, science does not allow us to know and understand the world around us precisely. We can only perceive reality, with varying degrees of clarity, by means of hypotheses, theories, and experiments. Even the most precise experiments can only be meaningful within certain tolerances, confidence levels, and assumptions. We do not have the means for testing every single incidence of the phenomenon being investigated, therefore we rely on the use of the much smaller set of representative samples. The samples used need to represent the entire population being studied at large.

Statistics are important because, without this tool, experimenters can vary in their interpretation of the same data. Statistics serve to link the sample data with the real population parameters.

The universe is comprised of mathematical relationships; some of the relationships are very simple, while others are very complex [13]. Every set of data

obtained from a sample produced under a fixed set of conditions can typically be defined

by three numbers that can be calculated from the data: the number, the mean, and the

variance of the test results.
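A minimal sketch of those three summary numbers (the data are hypothetical; the sample variance uses the conventional n - 1 denominator):

```python
def summarize(results):
    """Return the three numbers that summarize a fixed-condition sample:
    the count, the mean, and the (sample) variance."""
    n = len(results)
    mean = sum(results) / n
    variance = sum((x - mean) ** 2 for x in results) / (n - 1)
    return n, mean, variance

# Hypothetical test results produced under one fixed set of conditions.
print(summarize([4.1, 3.9, 4.3, 4.0, 4.2]))  # -> (5, 4.1, ~0.025)
```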

Usually, the purpose of an experiment is to compare the difference between two

populations. In other words, do the results indicate that the old product or process is

equal to, better than, or worse than the new product or process? Statistics can aid the

experimenter in the decision-making process. There are four major steps in establishing

guidelines before an experiment is conducted:

• State the two alternative decisions.

• Define the acceptable risks for selecting the wrong alternative.

• Establish an objective criterion for selecting between the alternative decisions.

• Compute the requisite sample size.

Each of these is discussed below.

When comparing properties of interest between two populations, the experimenter

will be confronted with two alternative possibilities: either the properties of interest are

essentially the same for the two populations; or the properties of interest are significantly

different between the two populations [14]. In statistical terms, these alternatives are

stated as:

• Null hypothesis (Ho)—no essential difference exists between the properties of

interest in the two populations.

• Alternative hypothesis (Ha)—a significant difference does exist between the

properties of interest in the two populations.

Results are analyzed statistically and in one form may be presented as:

• Null hypothesis

H0: μ1 = μ2 for population means.

H0: σ1² = σ2² for population variances.

• Alternative hypothesis

Ha: μ1 ≠ μ2 for population means.

Ha: σ1² ≠ σ2² for population variances.

(Note: the symbols for mean and variances are given here as presented in [14].

However, after this chapter, these symbols will change meaning to follow standard GUM

and SAR conventions).

It is impossible to prove a null hypothesis correct, but it is relatively easy to prove a null hypothesis incorrect. (This is another way to describe the concept of falsifiability.)

Therefore, the proper decision-making procedure is to try to prove with high probability that the null hypothesis is false. If, with statistics, the null hypothesis is proved false, the experimenter can accept the alternative hypothesis to be true. Or, if the null hypothesis cannot be proved false, it will be accepted as true. Further details on statistical methods will be presented in Chapter 3.
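As an illustration of this decision procedure (a sketch with made-up samples; it uses SciPy's standard two-sample t-test, not a method prescribed by the sources cited here):

```python
from scipy import stats

# Hypothetical measurements from an "old" and a "new" process.
old = [4.1, 3.9, 4.3, 4.0, 4.2, 4.1, 4.0]
new = [4.4, 4.5, 4.3, 4.6, 4.4, 4.5, 4.3]

# H0: mu1 = mu2 versus Ha: mu1 != mu2.
# Welch's form avoids assuming equal population variances.
t_stat, p_value = stats.ttest_ind(old, new, equal_var=False)

alpha = 0.05  # acceptable risk of wrongly rejecting H0
if p_value < alpha:
    print(f"p = {p_value:.4f}: reject H0; the means differ significantly.")
else:
    print(f"p = {p_value:.4f}: H0 cannot be rejected.")
```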

2.3 Scientific Integrity

Without integrity, scientific results could not be relied upon. For this reason a foundation of science is that results must be reported with scientific integrity. Prof.

Richard Feynman, a well-known physicist, describes scientific integrity as [15]:

“A principle of scientific thought that corresponds to a kind of utter honesty—a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything that you think might make it invalid—not only what you think is right about it: other causes that could possibly explain your

results; and things you thought of that you’ve eliminated by some other experiment, and how they worked—to make sure the other fellow can tell they have been eliminated.

“Details that could throw doubt on your interpretation must be given, if you know them. You must do the best you can—if you know anything at all wrong, or possibly wrong—to explain it. If you make a theory, for example, and advertise it, or put it out, then you must also put down all the facts that disagree with it, as well as those that agree with it. There is also a more subtle problem. When you have put a lot of ideas together to make an elaborate theory, you want to make sure, when explaining what it fits, that those things it fits are not just the things that gave you the idea for the theory, but that the finished theory makes something else come out right, in addition... If we only publish results of a certain kind, we can make our theory look good. [However,] we must publish both kinds of results.”

He summarizes with the following remarks [15]:

“...the idea is to give all of the information to help others to judge the value of your contribution; not just the information that leads to judgment in one particular direction or another... We’ve learned from experience that the truth will come out. Other experimenters will repeat your experiment and find out whether you were wrong or right. Nature’s phenomena will agree or disagree with your theory. And, although you may gain some temporary fame and excitement, you will not gain a good reputation as a scientist if you haven’t tried to be very careful in this kind of work. And it’s this type of integrity, this kind of care not to fool yourself, that is missing to a large extent in much of the research in [pseudo]science.

“All the para-psychologists are looking for some experiment that can be repeated—that you can do again and get the same effect—statistically, even. They do a lot of things to get a certain statistical effect. Next time they try it they don’t get it any more. And now you find [the experimenter] saying that it is an irrelevant demand to expect a repeatable experiment. This is science?” [15]

We will apply the above concepts of science, paradigms, theories, experimentation, and scientific integrity, and compare and contrast them with SAR/InSAR classification practices in the following sections.

2.4 SAR/InSAR classification practices: science or pseudo-science?

Next we examine general SAR/InSAR classification methodology, in light of the

guidance of standard scientific practices. The main interest in this section is identifying

where this methodology may be deficient. In such areas, does the scientific method

suggest possible improvements? Each of the above concepts—the scientific method,

paradigms, normal science, theories, and scientific integrity—will be discussed in the

following sections. The basis for these general observations is an extensive study of the

methodology as it is practiced in the scientific literature, and particularly in refereed journals, since they are recognized as having greater credibility than conference papers

and other types of literature. This is an important distinction because, although excellent instances of the practice of the scientific method in projects may exist, they are not reported in the literature and so their status is unknown to this study.

2.5 Critique of the Current Paradigm

To review, a paradigm describes a particular school of thought and offers a

framework for the accompanying theory, methods, and standards. In this section, the current SAR/InSAR classification paradigm and its likely origins are examined.

An extensive search for something like a paradigm to guide activities in this field yielded little of value. Most of the early literature in the field simply discusses the new technology and how it works; electromagnetic theory as it relates to signal-scene interactions; and classification practices that follow other remote sensing studies closely (such as Landsat classification studies). The emphasis is on methodology and results, not on SAR/InSAR classification theory. Later literature offers more of the same; a formal paradigm apparently continues to be lacking. In its place can be

observed an informal, common understanding of how this practice is to be conducted and reported in the scientific literature, following the general methodology practiced by earlier authors. Figure 2.2 characterizes this methodology. Even though the studies differ widely in the algorithms presented, the types of datasets used (different sensors, polarizations, frequencies, resolutions, terrains, land cover, etc.), and the number and kinds of classes, the vast majority of studies follow the methodology shown in this figure.

[Figure 2.2 (flowchart): select scene(s), possibly applying preprocessing → choose classes based on prominence in the image → apply a new variation of a supervised or unsupervised classification technique, possibly using a model and/or input physical parameters, and possibly using training data from the same scene → adjust and optimize parameters and classes until results “look” good, possibly assessing accuracy → publish results.]

Figure 2.2. Typical methodology for single-use “disposable” classifiers, found throughout the scientific literature on SAR classification. The purpose of most studies is to show that the reported classification method is an improvement over earlier methods, by demonstrating the method on one (or a few) images.

A scene (i.e., a dataset corresponding to an area on the earth’s surface) is selected, often followed by preprocessing to minimize speckle and other effects. Classes such as

forest, grassland, buildings, etc. are selected based on their apparent dominance in the

scene. Then a new variation of an algorithm and/or model is introduced and applied to

classify the image. Parameters are adjusted and optimized until results are satisfactory

and the results compare as favorably as possible with the ground truth. In cases where

training data are used, the data are usually taken from the same scene on which the

testing is conducted. In many cases the accuracy of the now-classified data is evaluated.

Finally, results are published.

Although a formal paradigm could not be identified, it was observed that the

SAR/InSAR classification methodology has been adapted from at least three fields and

their respective (and more developed) paradigms: raster generalization, remote sensing,

and machine learning. A discussion of these “parent” paradigms can be found in

Appendix A, as well as problems associated with the related act of classification. (For

further references on the summary of the main points that follow, please consult this

Appendix.) Each field offers complementary, developed bodies of theory and

methodology. However, it should be noted that, in the informal adaptation of certain principles and in their combination into the current SAR/InSAR classification

methodology, some key elements of the original three paradigms have apparently been overlooked. This situation has resulted in less rigor in the approach taken, suggesting less reliable outcomes. The three fields (paradigms) offer the following contributions to

SAR/InSAR classification:

• Raster map generalization offers a general theoretical framework for

defining and specifying the experimental parameters, since classification is

really an exercise in raster generalization—a subset of map generalization.

• Remote sensing provides concepts and techniques for assessing the accuracy

of final results.

• Machine learning/pattern recognition offers insights into the importance of

appropriate representations in models and algorithms.

From a study of these three paradigms, a number of shortcomings in the current practice of SAR/InSAR classification became evident, falling short of the guidelines offered by the respective paradigms. Each shortcoming is summarized below.

2.5.1 Cartographic Map Generalization

The principles of the cartographic map generalization paradigm, as applied to

SAR/InSAR classification, tend to be weakly applied in the following ways:

• Explicit statements are not usually offered as to what are the important

features to classify, the intended feature prominence, and why. Rather, a

study tends to choose classes as a matter of convenience, according to the

dominant characteristics of a given scene. There is often a priori selection of

a particular scene to align with the desired features sought.

• Features/classes are not clearly and explicitly defined (capture conditions).

“Oh, this looks like a forest so we’ll label it as one.” What one operator

considers to be forest may differ considerably from the next one.

• Explicit statements are generally not offered in the scientific studies as to what

scale is needed in the end product. Rather, the studies tend to be dataset-

driven: from this dataset we can capture these characteristics. “Let’s see what

this sensor is capable of doing,” rather than, “What are the specifications for

the job that needs to be done, and to what degree can this sensor meet those

specifications?”

• Physical, physiological, and psychological limits of the sensor, hardware,

software, and technicians and users, tend to not be taken into account. Rather,

the main emphasis is the novelty of the approach as demonstrated on a single

scene, regardless of the amount of effort it takes to get the results.

• The reliability, precision, and completeness of the mapped data are not

accurately represented, because the assessment omits a great deal of

information about the sensor, processing, algorithmic, classification, and other

types of error, beyond a simple “truth table” on a single scene.

Part of the critique reflects the fact that the studies are designed to be exploratory, new, and cutting-edge, since they are written for research journals. But in this process, the studies overlook the need to examine the limits of consistency and the reliability of a particular approach under somewhat different conditions—which is essential information in moving the technology from one-of-a-kind research studies into a production mode.

2.5.2 Remote Sensing Classification Paradigm

A study of the remote sensing classification paradigm leads to the following critique of current practices in SAR/InSAR:

• There is no assessment as to whether the training classes fully characterize all

of the classes (it is easy to observe that they do not, especially when our

interest is cross-scene classification).

• An insufficient number of training examples is typically used (fewer than 10 training examples per class is typical in SAR, while 100 per class is desirable [ref. 13 in Appendix A]).

• In many cases error assessment is not done at all; when it is practiced, it rarely includes the use of at least 50 collected samples per class for the error matrix, via stratified random sampling (a preferred method; see the sketch below).
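A sketch of the kind of error-matrix accuracy assessment referred to in the last point (with hypothetical labels; a real assessment would use at least 50 samples per class, as noted above):

```python
import numpy as np

def error_matrix(truth, predicted, classes):
    """Confusion (error) matrix: rows = reference data, columns = map labels."""
    idx = {c: i for i, c in enumerate(classes)}
    m = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(truth, predicted):
        m[idx[t], idx[p]] += 1
    return m

def accuracies(m):
    overall = np.trace(m) / m.sum()
    per_class = np.diag(m) / m.sum(axis=1)  # producer's accuracy per class
    return overall, per_class

classes = ["forest", "water", "urban"]
truth = ["forest", "forest", "water", "water", "urban", "urban"]
predicted = ["forest", "water", "water", "water", "urban", "forest"]
overall, per_class = accuracies(error_matrix(truth, predicted, classes))
print(overall, dict(zip(classes, per_class)))
```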

2.5.3 Machine Learning Paradigm

The machine learning paradigm offers several important points in critiquing the

methodology for classifying SAR/InSAR data, as reported in the literature:

• A discussion of the representation and its appropriateness to the given

conditions is rarely offered.

• A discussion of the role played by the physics of the scene and sensor is rarely

offered.

• A discussion on regularities and constraints of the particular dataset or method

is rarely given.

• The algorithmic procedure for the classification mechanism tends to be

described in great detail, but other procedures are scarcely mentioned.

• In the experimental phase, the computational complexity in time and space is

rarely mentioned. No matter that the dataset size might have been only 500

pixels by 500 pixels, and it took 1 gigabyte of disk space and 12 hours to

classify it!

The use of certain domain knowledge (i.e., feature classification standards) for thematic mapping will be explored in Chapter 8 as a means to define the desired classes.

Without such a framework, the definition of what constitutes a given class will continue

to vary for different researchers, further complicating comparisons. In summary, the

following problems related to SAR/InSAR classification are noted:

• The feature classes and conditions under which they are classified tend to not

be clearly defined.

• The learning problem—how to group classes—is not well-defined.

• How to evaluate classifications is not sufficiently defined.

• Algorithms are optimized on single scenes under very limited conditions.

• Models are often based on regional statistics only, and do not incorporate the

physics of the scene.

As seen in the foregoing discussion, the current (informal) methodology of Figure

2.2 has a number of significant shortcomings. These problems, in turn, limit the

effectiveness of the resulting theories and methods of experimentation. The dominant

theories and models in SAR/InSAR classification will be further examined and critiqued

in Chapters 5, 6, 7, and 8, and further discussion will be deferred to these chapters.

2.6 Critique of SAR/InSAR Classification and the Scientific Method

In this section the scientific method and its three criteria—reproducibility, explanatory power, and falsifiability—are compared with common practice in

SAR/InSAR classification. Each criterion is examined in turn.

1. Reproducibility — measurements and observations need to be described

meticulously. Is this the case?

Measurements and observations of the landscape conditions and their effect

on the signal are rarely described except in a few terse statements, yet

landscape conditions (such as moisture levels) and composition can play a

major role in varying the signal response. The algorithm used to process the

measurements is, in contrast, usually described in great detail. Because of the

lack of description, there is simply not enough information to assess

reproducibility.

2. Explanatory power. Can it explain beyond the particular situation being

studied?

Some of the models have limited explanatory power (to be discussed in

Chapter 7), but, in general, they have not been tested under a wide enough

range of controlled (or realistic) conditions to interpret the real effectiveness

of the models.

3. Falsifiability: Does it follow a theory’s method of constructing a hypothesis

and trying to disprove the null hypothesis?

The typical inferred hypothesis is that method “X” is a better classifier than

method “Y” or “Z” in other studies (or sometimes within the same study),

since “X” has a higher accuracy rate—as generated by the truth table. Yet this

method of comparison is fundamentally flawed. Success of one method in

general cannot be compared with another in the current paradigm, since

accuracy assessments and classifications are relative, and often deal with

different domains (or landscapes).

The scientific literature in this area tends to be dedicated to novelty, rather than to understanding the factors that contribute to consistency of response. The outcome is that there is no common way to assess or compare results. Without a standardized system for

making comparisons, it is impossible to determine whether one method is more effective

The typical practice is that a new or modified classifier is tuned to a specific scene, in order to obtain minimum classification error. SAR/InSAR image classification largely follows the more traditional multispectral classification (i.e., remote sensing) paradigm. Mitchell reports, “A major limitation of multispectral classifiers, as they are normally used, is that the classifier is scene-specific and cannot be used on any other scene” [16]. Schott also reports, “...it is often not cost effective to build extremely elaborate classifiers only to discard them after a single use” [17].

This means that the scientific literature and practice, by emphasizing novelty, actually do little to advance their discoveries into a consistent production environment. This results in the following situation:

“Direct comparisons between two algorithms are seldom possible because independently developed IU [image understanding/SAR classification] algorithms are seldom designed to function in the same domain under the same circumstances...” [18]

Comparison and evaluation are made difficult by the fact that different resolutions, sensor wavelengths, polarizations, terrain, etc. are featured in the various studies. As a result there is no common basis for evaluating the efficacy of a given approach. Further, on the issue of comparing different systems, Briscoe and Caelli report:

“It is difficult to evaluate and compare different systems as they often produce very different types of classification descriptions. Classification accuracy [e.g. confusion matrix] is one measure of performance that is often used, but this suffers from the problem of insufficient training data to obtain appropriate generalizations or unbiased estimates of the classification probabilities. Further, cross-domain comparisons are difficult...”[19]

So, is hypothesis A (or experiment X) better than hypothesis B or C (or experiment Y or

Z)? No one really knows with certainty. Further, the continued practice of running tests

on single images poses additional problems:

“The role and design of experiments...illustrates...methodological points and the efficacy of formal analysis and theoretical approaches. Experimental studies [are important] as a way of illuminating the nature of learning mechanisms, for discovering the reasons for their success or failure, and for comparisons of existing techniques...[However] tests on a single domain are not sufficient to draw reliable conclusions about the relative performance of algorithms.” [19]

In the final section, the issue of scientific integrity in SAR/InSAR classification is

discussed.

2.7 Critique of SAR/InSAR Classification and Scientific Integrity

Some guidelines for scientific integrity were discussed earlier. The four main

points of that discussion were: (1) report everything that might make results invalid or

throw doubt on the interpretation; (2) discuss facts that disagree with the theory or

approach; (3) discuss how well the theory explains things beyond the particular case under study; and (4) the experiment should be repeatable. According to these points, are

SAR/InSAR classification studies conducted with integrity? This discussion is not intended to challenge the efforts of the many scientists who have followed the current paradigm with thoroughness and responsibility, but rather it is meant to challenge the paradigm itself. As will be seen in the following chapters, points 1 and 2 are partially addressed in some cases, while points 3 and 4 are not addressed at all.

2.8 Summary

A common theme in all of the above observations is that there tends to be a lack of rigor in determining the conditions for classification and assessing the true level of uncertainty in the measurements. This indicates the need for a better-defined methodology for carrying out SAR/InSAR classification, which will be further examined in Chapter 3.

To its credit, the methodology depicted in Figure 2.2 works reasonably well for single scenes, and addresses a difficult topic. The algorithms and implementations are highly complex. However, the lack of a strong underlying paradigm and theory, based on established scientific principles, must be criticized.

2.9 A Case Study

To illustrate some shortcomings that accompany the current paradigm, a case study will be examined that involves the creation and evaluation of a global thematic map. This study was reported in the September 1999 issue of PE&RS [20]. The effort is indicative of state-of-the-art practices in remote sensing classification, as currently carried out by scientists around the world. Since no other validated remote sensing studies have yet been attempted on a scale of this magnitude, examination of the effort should provide insight into some of the limitations of the current paradigm.

From 1992 to 1999, an international effort resulted in the creation of the first-ever global thematic land cover dataset, for which the accuracy was validated using internally-consistent practices. The effort was conducted under the auspices of the International

Geosphere Biosphere Program’s (IGBP) Data and Information System, for the purpose of meeting land cover requirements for certain studies in global change. The 1-km resolution data set included 17 cover classes, and was created from over 4.4 terabytes of data from the Advanced Very High Resolution Radiometer (AVHRR). [21, 22] This is the only effort of its kind and magnitude known to be undertaken to date.

Similar practices would likely be employed to create a global (or continental) thematic database from SAR/InSAR data (such as data from the recent SRTM mission).

This is because SAR/InSAR data are a type of remote sensing data, and classification practices in the literature can be observed to apply the same general methodology. For this reason, a closer examination of the AVHRR effort is warranted to identify potential and specific shortcomings.

2.9.1 The Methodology

In order to create an intemally-consistent, global thematic dataset, IGBP project teams formulated uniform schemes for developing the dataset, for classification, and for validation. One specification for the new database was that it should be “developed using an objective, repeatable, and systematic methodology.” [23, italics added] Another specification was that, to satisfy earth science requirements, the data should attain an accuracy level of at least 85%. Four main steps were devised to: (1) generate composites from AVHRR data, (2) perform unsupervised classification, (3) perform supervised determination of class labels, and (4) validate results [24,25,26]. Each of the four steps is summarized below:

1. Generate composites from AVHRR. A total of thirty-six 10-day composites were generated from 1992-1993 AVHRR data. The 10-day composites were then recomposited into 12 continental monthly NDVI (Normalized Difference Vegetation Index) data sets. The recompositing “reduced atmospheric contamination, decreased the effects of off-nadir viewing, and reduced the NDVI data volume by two-thirds.” [23]

2. Perform unsupervised classification. The classification methodology involved mapping land cover for each continent in turn. Unsupervised classification was performed on the NDVI composites to identify greenness classes. Each cluster was assigned a preliminary cover-type label on the basis of the spatial patterns and spectral or multi-temporal statistics in each class. The urban class was then added from the “urban” layer of the Digital Chart of the World. [22]

3. Perform supervised determination of class labels. Each preliminary cover-type

label was compared with ancillary data. The determination of class labels was

made by a team of three expert interpreters, “with the final class labels based

on a consensus between interpreters.” [23]

4. Validate results. One of the PE&RS articles admitted that there was no

current state-of-the-art for validating global land-cover classification:

“While methods...based on confusion matrices...are well established, they have generally only been applied to local scale classifications, occasionally to regional scale work, and only in isolated instances to scales greater than these [22].”

Therefore, a new validation strategy had to be devised, based upon two assumptions:

(1) that at I-km resolution, the thematic and spatial accuracy of the classified data sets are inseparable, and (2) that higher resolution satellite images from sensors on Landsat or

SPOT provide accurate, independent reference data that describe the true land-cover classes. The core sampling strategy was based on state-of-the-practice procedures, with

“truth” sites identified in the higher resolution imagery using a stratified random sample of 25 points per class. [22]
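To make step 1 concrete, the following is a minimal sketch (in Python with NumPy; the array names and values are invented for illustration and are not drawn from the IGBP processing chain itself, which is documented in [23]). It computes NDVI from the red and near-infrared channels and forms a maximum-value composite, a standard technique for suppressing cloud and off-nadir contamination:

```python
import numpy as np

def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - RED) / (NIR + RED)."""
    red = red.astype(np.float64)
    nir = nir.astype(np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(nir + red > 0, (nir - red) / (nir + red), 0.0)

def max_value_composite(ndvi_stack):
    """Per-pixel maximum over a stack of co-registered NDVI scenes.

    Keeping the maximum NDVI at each pixel tends to discard cloudy or
    strongly off-nadir observations, which depress NDVI.
    """
    return np.max(ndvi_stack, axis=0)

# Hypothetical example: ten co-registered daily scenes forming one composite.
rng = np.random.default_rng(0)
red_scenes = rng.uniform(0.05, 0.30, size=(10, 128, 128))  # visible channel
nir_scenes = rng.uniform(0.10, 0.50, size=(10, 128, 128))  # near-IR channel
daily_ndvi = np.stack([ndvi(r, n) for r, n in zip(red_scenes, nir_scenes)])
composite = max_value_composite(daily_ndvi)  # one 10-day composite
```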

2.9.2 The Results

The fact that a single methodology was applied implies that a consistent global classification should have resulted. However, after the above validation process was completed, the average class accuracy was determined to be only 59.4% [27], considerably short of the 85% accuracy sought. (When fewer classes were used, the overall accuracy did increase somewhat [28].) The evaluations discussed four potential factors as influencing the accuracy of the classifications: (1) poor satellite data quality in some areas due to atmospheric effects, (2) variances in classification methods and interpreter skills, (3) limitations in reference data, and (4) resource limitations [23]. Each of the factors is discussed below, serving to emphasize some of the limitations in the current methodology.
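For context on how such figures are computed, the following is a minimal sketch (with invented counts, not the project's data) of deriving per-class and average class accuracy from a confusion matrix of validation sites:

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class (from reference imagery),
# columns = mapped class. Entry [i, j] counts validation sites of true class i
# that the map labeled as class j.
confusion = np.array([
    [18,  4,  3],
    [ 5, 15,  5],
    [ 2,  6, 17],
])

# Per-class accuracy: correctly labeled sites / reference sites in that class.
per_class = np.diag(confusion) / confusion.sum(axis=1)

# Average class accuracy: unweighted mean of the per-class accuracies.
average_class_accuracy = per_class.mean()

# Overall accuracy: total correct / total sites (a different, often higher figure).
overall = np.diag(confusion).sum() / confusion.sum()

print(per_class, average_class_accuracy, overall)
```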

2.9.2.1 Limiting Factor 1: Poor satellite data quality

Many of the composites were observed to contain artifacts from clouds, and other forms of atmospheric contamination were also frequently present. For example, nearly 47% of the South American regions had contaminated composites [23]. Land-cover characteristics in many parts of the world, and certainly in the tropics, were affected by atmospheric and other environmental contaminants [29].

In many cases, poor image quality further weakened the already ambiguous relationship between spectral data and land cover. Since sensor radiometric calibration, atmospheric effects, and sensor spectral and spatial response are key variables that affect image classification, scientists conjectured that improving these variables would improve the classification results [24]. Another article concluded that improvements in the geometric and radiometric quality of the AVHRR data must be a top research priority [29].

2.9.2.2 Limiting Factor 2: Variances in classification methods and interpreter skills

The authors' remarks underscore the general problems in classification discussed earlier in this chapter and expanded upon in Appendix A:

"...the subjective element inherent in image interpretation leaves the practice open to much criticism...it is evident that different interpreters approach the land-cover classes from many different perspectives." [30]

“...it became clear that each interpreter had unique perspectives about different geographic domains and landscape types [23].”

“Classification methods have long been identified as an issue in determining classification accuracy. Any methodology should provide the flexibility to detect significant land-cover patterns consistently and under a wide range of environmental conditions. Results of such studies give rise to interesting debates but avoid the nagging issue: does computer-assisted image classification provide consistent land-cover classifications for large study areas? However, the impact of different parameters on classification accuracy [in this study] is uncertain...[and] is likely to be the least understood issue [23].”

"Based on this study...it can no longer be assumed that all...classes are equally interpretable on satellite spectral imagery...The results also clearly demonstrate the difficulty associated with the interpretation of many of these classes from remotely sensed data sets [27]."

From these comments, it is clear that variable class descriptors, indistinct class discriminators in the data, and variations in classification methods all continue to present significant problems in achieving consistent results.

2.9.2.3 Limiting Factor 3: Limitations in reference data

The authors justify the use of Landsat and SPOT data as “truth” with the following statement:

“The use of high-resolution data as a surrogate for reference data at least has precedents. Visual interpretation of higher resolution satellite imagery, either from hard copy prints or on screen using image processors, has been used for operational crop yield forecasting on a pan-European basis and has proved to be cost effective, capable of delivering consistent results over large areas, and easy to apply in an operational setting.” [31]

It is interesting to note that the authors report the discovery of a consistent offset in the Landsat Thematic Mapper data, of approximately 1 km to the east and 1 km to the south. The authors extrapolate that perhaps 20% of the pixels for the conterminous United States were adversely affected by this registration error. [32] The resulting 1.4 km offset was only measured for the conterminous United States, so the actual misregistration for other parts of the globe is unknown. The authors note that, "the ubiquitous nature of the offset in this study certainly raises a global concern... The effect that this offset has on the validation accuracy statistics is difficult to ascertain." [32]

Other authors commented:

“...there is legitimate concern with respect to the objectivity and repeatability of methods that rely heavily on interpretation and convergence of disparate data sources with unknown accuracy... Based on the results of this study,...the quest for accurate results with fully automated classification may not be achievable.” [32]

"...results more than adequately demonstrate the difficulty associated with the interpretation of many of the DISCover legend thematic classes from higher resolution Landsat TM and SPOT data sets." [29]

Moreover, the reference land cover and vegetation maps were of varying quality and time periods. Almost all were printed prior to 1992, and few references were available for South America and the Middle East. Authors noted that the scarcity of agricultural references was particularly troublesome in emerging agricultural areas of South America, Africa, and the tropical Pacific. Because the map quality varied, all references had to be used with caution. [23]

2.9.2.4 Limiting Factor 4: Resource Limitations

To some degree, the choice of methodology was determined by the available data and resources. The IGBP study was an enormous undertaking, involving the efforts of more than one hundred scientists and technicians over nearly a decade. Participation in the study was mostly voluntary and self-funded, with total direct funding of less than $500,000 [22], plus indirect funding of perhaps several million dollars in contributed data and labor. These restrictions forced the use of methodologies that were less than optimal. But for the same reasons, this study is highly instructive, because broader-based thematic mapping efforts tend to be constrained by similar practices and funding restrictions.

Some thought that greater use of higher resolution satellite imagery during the interpretation process would have increased the accuracy of the results, but the high cost and sparse availability of contemporary Landsat TM data made this option unviable [24], and "more precise products come at a higher cost [32]".

Others mentioned that staff and budget resources may have the greatest overall impact on the quality of results [32], and decried the difficulties in getting a funding agency to pay for the cost of validating a data set. Yet to participants, it was clear that, "Funding must cover the range of global data set issues, from mapping to validation, as well as basic and applied research." [29]

2.9.3 General Observations by Authors

The authors concluded that the project was successful overall because, as a first effort of its kind, it served to "highlight results, identify limitations and uncertainties, and provide insights into future global mapping and validation initiatives." [21] One article set forth the perspective that the accuracy results were reasonable because, "While there are no accuracy standards for large-area land-cover mapping with AVHRR data, a small number of research studies over small areas report accuracies ranging from 50-80%," and the global accuracy results simply aligned with those of the earlier studies [23].

Other insightful observations highlighted different problems and needs:

"Perhaps the challenge to improving the quality of land-cover maps relates as much to the data used in the classification as it does to algorithms or methods. Simply put, the problem may be the data. The relationship between land-cover and temporal-spectral data is too frequently ambiguous...Different land-cover types often have similar spatial, temporal, and spectral characteristics." [29, italics added]

“Better understanding of the stability of the characteristics of global land cover...is also required. For example, the interannual variability of land cover at the global scale should be documented by comparing the results of this classification to data from other years in order to determine the historical variance of phenology and productivity.” [29]

“The effort to develop an improved land-cover characteristics database...achieved most stated objectives...the results provide evidence that the methodology was objective, systematic, and suited to the socioeconomic, cultural, and natural forms and patterns of land cover found across the globe. Whether or not the methodology is repeatable has yet to be established.” [23, italics added]

The last quote is particularly interesting, given that one of the leading specifications was that the method be repeatable. Another was that at least 85% accuracy was required.

2.9.4 IGBP Future Plans

The problems uncovered as a result of this research led to the establishment of several IGBP sub-groups to explore the issues raised in greater detail. A key topic under study is the validation of land surface parameters, which has the following objectives [29]:

• Promote the quantification and characterization of satellite land product accuracy;

• Share land product validation past experience and lessons learned;

• Move towards the generation of "standardized products with known accuracy" from similar sensing systems in the context of data continuity;

• Establish relationships between like products, e.g., vegetation indices;

• Develop in-situ validation measurement standards, protocols, and traceability;

• Coordinate international validation activities; and

• Improve access to validation data sets.

But do these measures go far enough, if repeatability is the aim? Perhaps not—as we will see in the following chapters.

2.9.5 An Independent Critique of the Global Mapping Project

Several conclusions can be drawn from this project:

• Without clear and unambiguous descriptions as to what comprises a given class, consistent classifications cannot be achieved.

• Without consistent classifications, repeatability cannot be achieved.

• Relying on ambiguous, inaccurate, and outdated datasets as "truth" can, in and of itself, introduce a significant and unknown degree of error.

These conclusions, in turn, refer us back to certain scientific principles discussed earlier:

1. The measurand must be clearly and unambiguously defined, as far as possible. In this case, the measurand may be viewed as the class definition.

2. If a method or measurement cannot be demonstrated to be repeatable, then it cannot be called a scientific method or measurement. For this reason, the IGBP study, through its use of standard remote sensing classification practices, cannot be considered scientific in its current form.

3. If ground truth is not traceable to a clearly defined, absolute standard which itself has a clearly-defined measure of error associated with it, then it should not be used as truth.

4. Any method used to assess uncertainty (i.e., for dataset verification) must itself have a quantifiable component of uncertainty, which is taken into account in reporting the results.

5. In order to achieve an unambiguous classification, in some cases it may be necessary to use multiple data sources, because by their nature single imagery sources are often ambiguous.

It is encouraging to note that national and international efforts to develop geospatial and measurement standards are underway. In time, these standards may facilitate a clearer understanding and establishment of the above principles. These recent and emerging standards will be examined in Chapter 3.

2.10 References

[1] Hales, Stephen, His Vegetable Staticks (1727), from http://www.english.upenn.edu/~jlynch/Frank/People/hales.html

[2] URL http://madsci.wustl.edu/posts/archives/apr99/925157418.Sh.r.html.

[3] Popper, Karl R., Conjectures and Refutations: the Growth of Scientific Knowledge. Second Edition. New York: Basic Books, Inc., 1965, Chapter 11.

[4] Ibid., Chapter 1.

[5] Santrock, John W., Adult Development and Aging. Dubuque, Iowa: William C. Brown Publishers, 1985, pp. 24-25.

[6] Kuhn, Thomas S., The Structure of Scientific Revolutions. Third Edition. Chicago and London: The University of Chicago Press, 1996, pp. 10-22.

[7] Ibid., pp. 10-34.

[8] Ibid., pp. 36-110.

[9] Popper, Karl R., Conjectures and Refutations: the Growth of Scientific Knowledge, Second Edition. New York: Basic Books, Inc., 1965, Chapter II.

[10] Ibid., pp. 22-30.

[11] Ibid., Chapter 10.

[12] Diamond, William J., Practical Experiment Designs for Engineers and Scientists. Belmont, California: Lifetime Learning Publications, 1981, pp. 3-38.

[13] Ibid., p. 7.

[14] Ibid., pp. 18-19.

[15] Feynman, Richard, Cargo Cult Science, adapted from a Caltech commencement address given in 1974, from the book Surely You're Joking, Mr. Feynman!. URL http://pc65.frontier.osrhe.edu/hs/science/feynman.htm.

[16] Mitchell, Tom M. Machine Learning. New York: McGraw-Hill Companies, Inc., 1997.

[17] Schott, John R. Remote Sensing: The Image Chain Approach. New York: Oxford University Press, 1997, pp. 233-288.

[18] Defense Advanced Research Projects Agency (DARPA), "Image Understanding for Battlefield Awareness," Broad Agency Announcement (BAA) 96-14, issued September 12, 1996, Section C.

[19] Briscoe, Garry, and Caelli, Terry. A Compendium of Machine Learning. Volume 1: Symbolic Machine Learning. Norwood, New Jersey: Ablex Publishing Company, pp. 11,13.

[20] Morain, Stanley A., PE&RS — Photogrammetric Engineering and Remote Sensing, Special Issue: Global Land Cover Data Set Validation, Volume 65, Number 9, September 1999, pp. 1011-1093.

[21] Loveland, T.R., Estes, J.E., and Scepan, J., "Introduction," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1011-1012.

[22] Belward, A.S., Estes, J.E., and Kline, K.D., "The IGBP-DIS Global 1-Km Land-Cover Data Set DISCover: A Project Overview," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1013-1020.

[23] Loveland, T.R., Zhu, Z., Ohlen, D.O., et al., "An Analysis of the IGBP Global Land-Cover Characterization Process," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1021-1032.

[24] Muchoney, D., Strahler, A., Hodges, J., and LoCastro, J., “The IGBP DISCover Confidence Sites and the System for Terrestrial Ecosystem Parameterization: Tools for Validating Global Land-Cover Data,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1061-1068.

[25] Brown, J.F., Loveland, T.R., Ohlen, D.O., and Zhu, Z., “The Global Land-Cover Characteristics Database: The Users’ Perspective,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1069-1074.

[26] Scepan, J., Menz, G., and Hansen, M.C., “The DISCover Validation Image Interpretation Process,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1075-1082.

[27] Scepan, Joseph, "Thematic Validation of High-Resolution Global Land-Cover Data Sets," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1051-1060.

[28] DeFries, R.S. and Los, S.O., "Implications of Land-Cover Misclassification for Parameter Estimates in Global Land-Surface Models: An Example from the Simple Biosphere Model (SiB2)," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1083-1088.

[29] Estes, J., Belward, A., Loveland, T., et al., “The Way Forward,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1089-1093.

[30] Kelly, M., Estes, J.E., and Knight, K.A., “Image Interpretation Keys for Validation of Global Land-Cover Data Sets," Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1041-1050.

[31] De Boissezon, H., Gonzales, G., Pous, B., and Sharman, M., “Rapid Estimates of Crop Acreage and Production at a European Scale Using High Resolution Imagery—Operational Review,” Proceedings of the International Symposium on Operationalization of Remote Sensing, Enschede, The Netherlands, 19-23 April, 1993, International Institute for Aerospace Survey and Earth Sciences, Enschede, 2:94-105.

[32] Husak, G.J., Hadley, B.C., and McGwire, K.C., “Landsat Thematic Mapper Registration Accuracy and its Effects on the IGBP Validation,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1033-1040.

CHAPTER 3

REVIEW OF ACCURACY STANDARDS

Quantitative assessments of accuracy are necessary to decide whether observed differences between results reflect more than experimental variability, whether tests comply with specifications, or whether laws based on limits have been broken. Without information on uncertainty, there is a real risk of over- or under-interpreting the results, resulting in incorrect decisions. Decisions made on such a basis can result in unnecessary expenditure in government or industry, incorrect legal prosecution, or adverse health or social consequences.

This chapter reviews two types of existing and emerging standards related to accuracy in measurements: (1) general standards for uncertainty in measurement, and (2) geospatial data standards. The main purpose of the review is to seek out concepts on repeatability and reproducibility that may be applicable towards reliable thematic classification of SAR/InSAR data, as well as to critique recent or pending releases of geospatial data standards in light of the measurement standards. Both national and international standards are reviewed.

Section 3.1 reviews four general measurement standards: (1) the Guide to the Expression of Uncertainty in Measurement (GUM); (2) ISO 5725-2 Accuracy (trueness and precision) of Measurement Methods and Results—Part 2: Basic Method for the Determination of Repeatability and Reproducibility of a Standard Measurement Method; (3) NIST Technical Note 1297 Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results; and (4) ISO TC69/SC6/WG7 Statistical Assessment of the Uncertainty of Measurement Results: Guide to the Use of Repeatability, Reproducibility and Trueness Estimates in Measurement Uncertainty Estimation.

Section 3.2 on geospatial data standards also reviews four standards: (1) the Spatial Data Transfer Standard; (2) the Content Standard for Digital Geospatial Metadata; (3) the Content Standard Extensions for Remote Sensing Metadata; and (4) the ISO/TC 211 Draft Standard for Geographic Information/Geomatics. These standards are of interest because they are intended to prescribe uniform ways in which to describe geospatial data. Although the evaluation of geospatial data is complex and multi-faceted [1], we will examine whether important concepts set forth in the general measurement standards may be lacking in the geospatial data standards.

Finally, Section 3.3 summarizes the main observations and potential uses of the standards as they relate to the problem under consideration.

3.1 General Measurement Standards

Our primary interest is in examining the methodology to promote repeatability and reproducibility in the measurement of physical quantities. These concepts are important in the current study because, at its foundation, a SAR/InSAR system simply measures a set of physical quantities. After measurement, the values representing the physical quantities undergo myriad transformations to produce the data used in thematic classification. Since methodologies are available to promote repeatability and reproducibility in measuring other physical quantities, perhaps the same methodologies can also be applied to achieve repeatable and reproducible results from SAR/InSAR data. These concepts will be examined below through the study of four relevant standards.

3.1.1 Guide to the Expression of Uncertainty in Measurement

The Guide to the Expression of Uncertainty in Measurement [1] was published by ISO in 1993 through the joint efforts of an international confederation of organizations, to produce a uniform method for evaluating and expressing uncertainty throughout the world. A uniform method was designed in order to allow measurements performed in different countries to be easily compared. GUM is primarily concerned with the expression of uncertainty in the measurement of a well-defined physical quantity—the measurand—that can be characterized by an essentially unique value. Otherwise, the phenomenon of interest must be represented as a distribution (or dispersion) of values, or as dependent on one or more parameters (such as time).

GUM is applicable to evaluating and expressing the uncertainty associated with the conceptual design and theoretical analysis of experiments, methods of measurement, and complex components and systems. It presents general guidelines, which may then be adapted to specific fields of science and engineering as needed. Some of its basic precepts were introduced in Chapter 1; here they will be expanded.

According to GUM, the objective of a measurement is to determine the value of the measurand, that is, the true value of the particular quantity to be measured. A measurement begins with an appropriate specification of the measurand, the method of measurement, and the measurement procedure. Generally, the result of a measurement is only an approximation or estimate of the true value of the measurand, and so the measurement must be accompanied by a statement of its uncertainty.

In many cases, the result of a measurement is determined on the basis of a series of observations obtained under repeatability conditions. A mathematical model of the measurement that transforms the set of repeated observations into the measurement result is critical, because (in addition to the observations) it usually includes various inexactly-known quantities that influence the result. This lack of knowledge contributes to the uncertainty of the measurement result, as do variations in repeated observations and uncertainty due to the mathematical model itself.

GUM treats the measurand as a scalar (a single quantity). To extend the concepts to a set of related measurands measured simultaneously in the same measurement, the scalar measurand and its variance must be replaced by a vector measurand and a covariance matrix. An error is viewed as having two components: a random, unpredictable component that results in variations in repeated measurements, and a systematic component that can be corrected by applying a correction factor. Other sources commonly attribute an error as having three components: the two mentioned above, plus blunders, which are erratic mistakes made during the measurement and measurement reduction process.

Many sources of uncertainty are possible. Sources of uncertainty that require special scrutiny in relation to SAR/InSAR classification include: incomplete definition of the measurand; imperfect realization of the definition of the measurand; the sample measured may not represent the defined measurand; and inadequate knowledge of the effects of environmental conditions on the measurement or imperfect measurement of environmental conditions. These sources will be further discussed in later chapters.

When all of the quantities on which a measurement result depends are varied, the results can be evaluated statistically. However, because of limited time and resources, the uncertainty is usually evaluated with a mathematical model of the measurement and the law of propagation of uncertainty. Thus, the Guide assumes that the model is sufficient to describe the measurement to the required degree of accuracy. Because the mathematical model may be incomplete, it is important to vary all of the relevant quantities as much as possible, so that the evaluation of uncertainty can be based on observed data. The procedure for evaluating and expressing uncertainty is summarized in Table 3.1.

1. Express the mathematical relationship between the measurand Y and the input quantities X_i on which Y depends, as Y = f(X_1, X_2, ..., X_N). The function f should contain all of the quantities (including correction factors) that contribute a significant component of uncertainty to the measurement result.

2. Determine x_i (the estimated value of input quantity X_i) on the basis of a statistical analysis of series of observations or by other means.

3. Evaluate the standard uncertainty u(x_i) for each input estimate x_i. (a) When the input estimate is obtained from a statistical analysis of a series of n independent, repeated observations under the same conditions of measurement, this is expressed by the estimated standard deviation, equal to the positive square root of the estimated variance. (b) When the input estimate is obtained by other means, u(x_i) is evaluated by scientific judgment based on a pool of information on the variability of X_i.

4. Evaluate the covariances for any input estimates that are correlated.

5. Calculate the result of the measurement (i.e., the estimate y of the measurand Y) from the relationship f, using the estimates x_i found in step 2.

6. Determine the combined standard uncertainty u_c(y) by combining the individual standard uncertainties u_i (and covariances as applicable) using the root-sum-of-squares or other equivalent method.

7. If it is necessary to give an expanded uncertainty U for the purpose of providing a confidence interval from y - U to y + U, then the combined standard uncertainty u_c(y) can be multiplied by a coverage factor k (typically between 2 and 3) to obtain U = k·u_c(y).

8. Report the result of the measurement y along with its combined standard uncertainty u_c(y) or expanded uncertainty U. Also describe how y and u_c(y) or U were obtained.

Table 3.1. Summary of Procedure for Evaluating and Expressing Uncertainty (after [1])
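As an illustration of steps 5 through 8 of Table 3.1, the following minimal sketch assumes a hypothetical two-input, uncorrelated measurement model y = f(x1, x2) (the model and all numeric values are invented for illustration). The sensitivity coefficients are approximated numerically and combined by root-sum-of-squares:

```python
import math

def f(x1, x2):
    """Hypothetical measurement model y = f(x1, x2)."""
    return x1 * x2  # e.g., a product of two measured quantities

def combined_standard_uncertainty(x1, u1, x2, u2, eps=1e-6):
    """Step 6 of Table 3.1 for uncorrelated inputs.

    Sensitivity coefficients c_i = df/dx_i are approximated by central
    differences; u_c = sqrt((c1*u1)^2 + (c2*u2)^2) is the RSS combination.
    """
    c1 = (f(x1 + eps, x2) - f(x1 - eps, x2)) / (2 * eps)
    c2 = (f(x1, x2 + eps) - f(x1, x2 - eps)) / (2 * eps)
    return math.sqrt((c1 * u1) ** 2 + (c2 * u2) ** 2)

# Steps 2-3: input estimates and their standard uncertainties (invented).
x1, u1 = 4.000, 0.020
x2, u2 = 2.500, 0.015

y = f(x1, x2)                                        # step 5
u_c = combined_standard_uncertainty(x1, u1, x2, u2)  # step 6
U = 2 * u_c                                          # step 7: coverage factor k = 2
print(f"y = {y:.3f} +/- {U:.3f} (k = 2)")            # step 8: report
```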

Other considerations of interest are as follows:

• The measurand cannot be specified by a value, but only by a description of a quantity. An incomplete description of the measurand introduces further uncertainty in the result of the measurement. There can be many values of the measurand if the definition of the measurand is incomplete.

• Determining the same measurand by different methods or in different laboratories can often provide valuable information about the uncertainty related to a certain method, and can also help identify systematic effects.

• Many measurements involve comparing an unknown object with a known standard having similar characteristics, in order to calibrate the unknown. In such cases the measurement methods are not especially sensitive to sample selection, since the unknown and the standard respond in generally the same (and often predictable) ways to the variables.

• Natural materials are often inhomogeneous, leading to two additional uncertainty components. Evaluating the first component requires a determination of how well the selected sample represents the parent material. Evaluating the second requires a determination of the extent to which the unanalyzed constituents influence the measurement, and how adequately they are treated by the measurement method.

• An incomplete knowledge of influence quantities and their effects can contribute significantly to the uncertainty of a measurement result.

• Clearly describe the methods used to calculate the measurement result and its uncertainty from the experimental observations and input data.

• List all uncertainty components and fully document how they were evaluated.

• Repeatability—the closeness of agreement between the results of successive measurements (of the same measurand) carried out under the same conditions of measurement. Results may be expressed quantitatively in terms of the dispersion of results. The conditions of repeatability include:

- the same measurement procedure
- the same observer
- the same measuring instrument used under the same conditions
- the same location
- repetition over a short period of time.

• Reproducibility—the closeness of agreement between the results of measurements (of the same measurand) carried out under changed conditions of measurement—may be expressed quantitatively in terms of the dispersion of results. A valid statement of reproducibility requires that any changed conditions be specified, which may include:

- principle of measurement
- method of measurement
- observer
- measuring instrument
- reference standard
- location
- conditions of use
- time

In Chapter 4 these conditions and guidelines will be discussed and applied towards the formulation of a theory and methodology for reliable (repeatable and/or reproducible) measurements for SAR/InSAR thematic maps.

3.1.2 ISO 5725-2 Accuracy (trueness and precision) of Measurement Methods and Results—Part 2: Basic Method for the Determination of Repeatability and Reproducibility of a Standard Measurement Method

The international standard ISO 5725-2 [2] was prepared by Technical Committee ISO/TC 69, Applications of Statistical Methods, Subcommittee SC 6, Measurement Methods and Results. ISO 5725 uses the terms "trueness" and "precision" to describe the accuracy of a measurement method. Here "trueness" refers to the closeness of agreement between the arithmetic mean of a large number of test results and the true or accepted reference value, and "precision" refers to the closeness of agreement between the test results themselves. This part of ISO 5725 is concerned only with estimation, through the use of the repeatability standard deviation and the reproducibility standard deviation. It relates only to measurements on a continuous scale that give a single value as the test result (even when the single value is the outcome of a calculation from a set of observations).

The standard describes the requirements for a precision experiment. In the basic method, samples from q batches of materials (representing q different levels of the test) are sent to p laboratories. Each laboratory obtains exactly n replicate test results under repeatability conditions at each of the q levels. A "laboratory" is a specific combination of an operator, the equipment, and the test site. Thus, a number of laboratories can exist at a given site, so long as the equipment and situations are independent. In each laboratory the measurements are to be carried out by one operator who is representative of those who would perform the measurements in normal operations. The measurements must be performed under the following eight conditions:

1. Equipment must first be checked as specified by the standard method.

2. Each set of n measurements is to be carried out under repeatability conditions: that is, within a short time interval, by the same operator, and without intermediate calibration of equipment (unless that is an integral part of the measurement procedure).

3. Each group of n tests under repeatability conditions must be performed independently, as if they were n tests on different materials. The purpose is to determine what differences can occur in results in actual testing. Consideration should be given to coding the samples so that the operator will not know which are replicates for a given level, so that previous test results do not influence later ones.

4. A set of q different groups of n measurements may be carried out on different days.

5. Measurements of all q levels must be performed by the same operator; furthermore, the n measurements at a given level must be performed using the same equipment throughout.

6. An operator change within a group of n measurements at one level is not permissible. However, if an operator change takes place between two of the q groups, this change must be reported in the results.

7. All measurements are to be completed within a given time limit. This limits the time between the day the samples are received and the day they are measured.

8. All samples must be labeled clearly, with the name of the experiment and a sample identification.

The end result is to derive an estimate of the reproducibility variance s_R^2 as follows:

s_R^2 = s_L^2 + s_r^2

where s_L^2 is the estimate of the between-laboratory variance; s_W^2 is the estimate of the within-laboratory variance; and s_r^2, the estimate of the repeatability variance, is the arithmetic mean of the s_W^2 values. This mean is taken over all of the laboratories participating in the accuracy experiment that remain after the outliers are removed.

The reproducibility standard deviation is the standard deviation of test results obtained under reproducibility conditions. A panel of experts familiar with the measurement method and its application should plan, coordinate, and evaluate the experiment, and establish the final results.
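For a balanced design with p laboratories and n replicates at a single level, these estimates can be computed as in the following minimal sketch (the array values are invented; the standard's full procedure also prescribes outlier screening, such as Cochran's and Grubbs' tests, which is omitted here):

```python
import numpy as np

# Hypothetical results: p = 4 laboratories x n = 3 replicates at one level.
results = np.array([
    [10.1, 10.3, 10.2],
    [ 9.8,  9.9, 10.0],
    [10.4, 10.6, 10.5],
    [10.0, 10.1,  9.9],
])
p, n = results.shape

cell_means = results.mean(axis=1)

# Repeatability variance s_r^2: mean of the within-laboratory variances s_W^2.
s_r2 = results.var(axis=1, ddof=1).mean()

# Between-laboratory variance s_L^2: variance of the laboratory means,
# corrected for the within-laboratory contribution (floored at zero).
s_L2 = max(cell_means.var(ddof=1) - s_r2 / n, 0.0)

# Reproducibility variance s_R^2 = s_L^2 + s_r^2.
s_R2 = s_L2 + s_r2

print(f"s_r = {np.sqrt(s_r2):.4f}, s_R = {np.sqrt(s_R2):.4f}")
```

With real data, each of the q levels of the experiment would be processed this way, after the outlier screening has been applied.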

3.1.3 NIST Technical Note 1297 - Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results

GUM (Section 3.1.1) was adopted by the U.S. National Institute of Standards and Technology (NIST) and re-presented in succinct form in 1994 as NIST Technical Note 1297, "Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results" [3]. NIST has put forth these guidelines not only for NIST measurement results, but also for measurement results associated with basic research, applied research and engineering, calibration and certification, and other uses. The guidelines are summarized below:

• The uncertainty of a measurement result generally consists of several components, which may be grouped into two categories according to the method used to estimate their numerical values: Type A—those evaluated by statistical methods, and Type B—those evaluated by other means.

• Each component of uncertainty that contributes to the uncertainty of a measurement result is represented by an estimated standard deviation, termed standard uncertainty, with suggested symbol u_i, equal to the positive square root of the estimated variance u_i^2.

• An uncertainty component in category A is represented by a statistically estimated standard deviation s_i, equal to the positive square root of the statistically estimated variance s_i^2, with the associated number of degrees of freedom v_i. For such a component the standard uncertainty is u_i = s_i. The evaluation of uncertainty by the statistical analysis of series of observations is termed a Type A evaluation of uncertainty.

• In a similar way, an uncertainty component in category B is represented by a quantity u_j, which is an approximation to the corresponding standard deviation. It is equal to the positive square root of u_j^2 (an approximation of the corresponding variance) and is obtained from an assumed probability distribution based on all the available information. The evaluation of uncertainty by any means other than a statistical analysis of series of observations is termed a Type B evaluation of uncertainty.

• Correlations between components of either category are characterized by estimated covariances or estimated correlation coefficients.

• The combined standard uncertainty of a measurement result (suggested symbol u_c) is taken to represent the estimated standard deviation of the result. It is obtained by combining the individual standard uncertainties u_i (and covariances as appropriate), whether arising from a Type A or Type B evaluation, using the law of propagation of uncertainty. This is also known as the "root-sum-of-squares" (square root of the sum of the squares) or "RSS" method.

• In cases where a measure of uncertainty is required that defines an interval about the measurement result y within which the value of the measurand Y is confidently believed to lie (such as for health and safety concerns), the expanded uncertainty U is used. It is obtained by multiplying u_c(y) by a coverage factor k, which is generally between 2 and 3, for example to define either a 95% or a greater than 99% level of confidence. Thus U = k·u_c(y), with confidence that y - U ≤ Y ≤ y + U.

• When a correction factor is applied to compensate for a systematic effect, every effort should be made to identify the effect.

3.1.4 ISO TC69/SC6/WG7 — Statistical Assessment of the Uncertainty of Measurement Results: Guide to the Use of Repeatability, Reproducibility and Trueness Estimates in Measurement Uncertainty Estimation (Draft)

ISO TC69/SC6/WG7 is a developing draft standard that merges elements of both ISO 5725 Part 2 (1994) and the Guide. While the Guide is a widely adopted standard approach, it is criticized for the absence of a comprehensive model of the measurement process. On the other hand, ISO 5725 Part 2 applies to a very wide range of standard test methods that have been subjected to collaborative study. This developing draft document offers "an appropriate and economic methodology for estimating uncertainty for the results of these methods which complies fully with the relevant BIPM [sampling] principles whilst taking advantage of method performance data obtained by collaborative study."

The document covers (1) comparing collaborative study results with the measurement uncertainty obtained from uncertainty propagation; (2) evaluating measurement uncertainties using data from collaborative studies; and (3) evaluating measurement uncertainties using intermediate measures of precision, especially in-house reproducibility over time. The document is applicable in all measurement and test fields for which an uncertainty for a result needs to be determined. The uncertainty of a measurement is usually comprised of many components.

The document follows GUM on the main topics of the uncertainty of measurement, standard uncertainty, combined standard uncertainty, expanded uncertainty, the coverage factor k, and other concepts. It adds a statement that an uncertainty budget should be compiled in order to evaluate a combined standard uncertainty for a measurement result; this budget is to include a list of sources of uncertainty and their associated standard uncertainties.

Reproducibility is defined as precision under reproducibility conditions, i.e., conditions where test results are obtained with the same method, on identical test items, in different laboratories, with different operators, using different equipment. (Note: a valid statement of reproducibility requires specification of the conditions changed.) Reproducibility may be expressed quantitatively in terms of the dispersion of the results.

In general, GUM's approach requires that the x_i are measured quantities, in a measurement traceable in every respect to SI units. When effects cannot be readily defined in measurable quantities (such as operator effects), the draft recommends the addition of standard uncertainties u(x_i) to allow for these effects, or the introduction of additional variables into f(x_1, x_2, ..., x_n). Because the focus is on individual input quantities, this is sometimes called a "bottom-up" approach to evaluating the uncertainty. When an approach (such as the collaborative study approach discussed in ISO 5725, based on the estimated reproducibility standard deviation s_R) focuses on the performance of the entire method, it is often called a "top-down" approach.

Accordingly, the first principle of the draft standard is that the reproducibility standard deviation obtained in a collaborative study is a valid basis for evaluating the measurement uncertainty. The second principle is that any effects not observed in the context of the collaborative study must be "demonstrably negligible or explicitly allowed for". This assumes that, in cases where reproducibility data are used, all laboratories are performing similarly, and the test materials are homogeneous and stable.

The draft standard then discusses how to reconcile the two apparently different approaches to the evaluation of uncertainty in GUM and ISO 5725. In the GUM approach, uncertainty is predicted in the form of a variance on the basis of inputs to a mathematical model. In ISO 5725, if influences vary during a reproducibility study, then the observed variance becomes the estimate of uncertainty. The draft notes that in practice, the uncertainty values found in the two approaches differ for a number of reasons, including (1) incomplete mathematical models (i.e., the presence of unknown effects), and (2) incomplete or unrepresentative variation of all influences during reproducibility assessment. Therefore, comparing the two different estimates is useful for assessing the completeness of the measurement models. Accordingly, it is important that the deficiencies in each approach are remedied. The draft therefore suggests the use of a hybrid approach, combining elements of the "top-down" and "bottom-up" evaluations, in order to address the possibilities of both model uncertainties and inadequate variation of input effects.

The recommended procedure is then as follows (a sketch of the combining step (v) appears after this list):

(i) Obtain estimates of the repeatability, reproducibility, and trueness of the method in use from published information about the method.

(ii) Determine whether the laboratory bias for the measurements is within that expected on the basis of the repeatability and reproducibility estimates obtained at (i).

(iii) Determine whether the precision obtained by current measurements is within that expected on the basis of the repeatability and reproducibility estimates obtained at (i).

(iv) Identify any influences on the measurement that were not adequately covered in the studies referenced at (i). Quantify the variance that could arise from these effects, taking into account the uncertainties in the influence quantities and the sensitivity coefficients.

(v) Where the bias and precision are under control (as demonstrated in (ii) and (iii)), combine the reproducibility estimate at (i), the uncertainty for trueness from (i) and (ii), and the effects of additional influences quantified at (iv), to form a combined uncertainty estimate.
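In the simplest case, step (v) amounts to an addition of variances. A minimal sketch, with invented values and assuming the checks in (ii) and (iii) have passed:

```python
import math

s_R = 0.12         # reproducibility standard deviation from the collaborative study, (i)
u_trueness = 0.03  # standard uncertainty of the trueness/bias check, (i) and (ii)
u_extra = [0.02, 0.05]  # standard uncertainties of influences not covered by the study, (iv)

# Step (v): combine the components as a root-sum-of-squares of variances.
u_combined = math.sqrt(s_R**2 + u_trueness**2 + sum(u**2 for u in u_extra))
U = 2 * u_combined  # expanded uncertainty with coverage factor k = 2
print(f"u = {u_combined:.4f}, U(k=2) = {U:.4f}")
```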

The collaborative study yields a set of performance figures (s_r, s_R, and in some cases a bias estimate), which form a specification for how well the method performs. When a method is adopted by a laboratory, that laboratory is expected to demonstrate that it is meeting the specification. This is usually achieved by studies for verifying control of precision and bias, and by continued performance checks for quality control and assurance. A bias check is a comparison between laboratory results and reference value(s). Ideally, the uncertainty associated with the bias check should be small, less than 0.2·s_R. Comparison with a reference standard should be performed under repeatability conditions, and the range of error should be less than 2·σ_D, where σ_D is estimated by s_D:

s_D^2 = s_L^2 + (s_W^2 / n)

where n is the number of replicates, s_W is the within-laboratory standard deviation derived from the replicates (or other repeatability studies), and s_L is the between-laboratory standard deviation. A suitable number of test items should be tested, using both the reference method and the test method, and a significance test performed on the result. In addition to this preliminary estimation of bias and precision, the laboratory should ensure that the measurement procedure continues in a state of statistical control, including regular checks on bias and precision using any relevant stable, homogeneous test item or material, and quality assurance measures.
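A minimal sketch of this bias check, with invented values (in practice s_L and s_W would come from the collaborative study, as described in Section 3.1.2):

```python
import math

s_L = 0.10  # between-laboratory standard deviation (from the collaborative study)
s_W = 0.08  # within-laboratory standard deviation (from replicates)
n = 5       # number of replicate measurements of the reference standard

# Standard deviation governing the expected spread of the lab-vs-reference difference.
s_D = math.sqrt(s_L**2 + s_W**2 / n)

lab_mean = 10.04         # mean of the n replicate results (invented)
reference_value = 10.00  # certified value of the reference standard (invented)
bias = lab_mean - reference_value

# The observed bias should fall within roughly two standard deviations.
print(f"bias = {bias:+.3f}, limit = +/-{2 * s_D:.3f}, in control: {abs(bias) < 2 * s_D}")
```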

Inhomogeneity studies require special consideration via experimental studies, which can yield a variance estimate (usually from ANOVA—Analysis of Variance) of replicate results on several test items. The between-item component of variance s_inh^2 represents the effect of inhomogeneity.
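A minimal sketch of such an inhomogeneity estimate, using the classical one-way ANOVA moment estimator s_inh^2 = (MS_between - MS_within) / n on invented replicate data:

```python
import numpy as np

# Hypothetical replicate results: 5 test items x 3 replicate measurements each.
data = np.array([
    [10.2, 10.1, 10.3],
    [10.6, 10.7, 10.5],
    [ 9.9, 10.0, 10.1],
    [10.4, 10.3, 10.4],
    [10.8, 10.9, 10.7],
])
k, n = data.shape
grand_mean = data.mean()

# One-way ANOVA mean squares.
ms_between = n * ((data.mean(axis=1) - grand_mean) ** 2).sum() / (k - 1)
ms_within = data.var(axis=1, ddof=1).mean()

# Between-item variance component: the effect of inhomogeneity.
s_inh2 = max((ms_between - ms_within) / n, 0.0)
print(f"s_inh = {np.sqrt(s_inh2):.4f}")
```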

Several main points summarize the foregoing discussion:

(1) Before adopting and using a given test method, the factors or sources that influence the accuracy of the test method need to be evaluated. This is important, since the accuracy of the test method affects the uncertainty of the test results. This evaluation is done through experiments conducted in different laboratories, designed and conducted according to ISO 5725. In this way, the precision of the test method is established as applied to a routine set of tests. From this interlaboratory study, the repeatability standard deviation and the reproducibility standard deviation are determined.

(2) When no reference material is available, the evaluation of trueness (that is, controlling the bias against a reference) presents methodological and technical questions. In such cases, trueness must be controlled by calibration of the test system through comparison with reference quantities by whatever means possible. However, the uncertainties associated with such calibrations must be verified to be very much less than the reproducibility standard deviation. In this way the bias may be controlled.

When it can be determined that the repeatability (as demonstrated by repeated test runs) is within the repeatability found in the interlaboratory study, then the precision is considered to be under good control, and the value of the reproducibility standard deviation is used to estimate the uncertainty standard deviation.

3.2 Geospatial Data Standards

Next a set of four geospatial data standards will be described, and evaluated at the end of the chapter in light of the measurement standards discussed in Section 3.1.

3.2.1 Spatial Data Transfer Standard

The Spatial Data Transfer Standard (SDTS) evolved out of the need to transfer geospatial data easily between different types of hardware and software systems and between various data formats, while minimizing loss of data in the exchange. It can be noted that SDTS offers an essential framework not only for enabling data transfer, but also for standardizing the way in which any spatial data of interest are described and documented.

In 1992, after twelve years of development spearheaded by the U.S. Geological Survey, the resulting standard was approved as Federal Information Processing Standard (FIPS) Publication 173. (This was superseded in 1998 by ANSI NCITS 320-1998.) Compliance with this standard is now mandatory for federal agencies. SDTS is described as: "...a transfer standard that embraces the philosophy of self-contained transfers, i.e. spatial data, attribute, georeferencing, data quality report, data dictionary, and other supporting metadata included in the transfer." [4]

According to the standard, the six parts of SDTS are: logical specifications, spatial features, ISO 8211 encoding, topological vector profile, raster profile, and point profile. In Part 1, Section 3, specifications for data quality are addressed in Spatial Data Quality [5], defined here as "fitness for use". The purpose of this section is to:

"...provide detailed information to a user to evaluate the fitness for a particular use. This style of standard can be characterized as "truth in labeling," rather than fixing arbitrary numerical thresholds of quality. To implement this portion of the standard, a producer is urged to include the most rigorous and quantitative information available on the components of data quality described in this section." [6]

The components of data quality include lineage, positional accuracy, attribute accuracy, logical consistency, and completeness, described below.

83 Lineage is defined as the history portion of the dataset. The lineage portion describes the material from which the data were derived as well as the methods of derivation, along with all transformations and control information. The dates of the source information and the dates of ancillary information must also be included.

Positional accuracy is defined as the nearness of a real world entity to its true position, in terms of an appropriate coordinate system. The positional accuracy portion includes the degree of compliance to a spatial registration standard, and must consider the quality of the final product after all transformations. Any tests of positional accuracy and their results must be reported, as well as the dates of the tests. The preferred test is to compare the product with an independent source of higher accuracy, following the rules in the ASPRS Accuracy Standards for Large Scale Maps [7]. Variations are to be reported either as additional attributes for each spatial object, or through a quality overlay such as a reliability diagram.
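As a sketch of the kind of comparison such a test involves (with invented checkpoint coordinates; the ASPRS standard [7] defines the actual class-dependent thresholds, which are not reproduced here), positional accuracy is commonly summarized by the root-mean-square error of well-defined checkpoints against an independent source of higher accuracy:

```python
import numpy as np

# Hypothetical checkpoints: (x, y) as mapped vs. the same points from an
# independent source of higher accuracy (e.g., a ground survey), in meters.
mapped = np.array([[100.2, 200.1], [150.4, 250.3], [199.7, 300.6]])
reference = np.array([[100.0, 200.0], [150.0, 250.0], [200.0, 300.0]])

dx = mapped[:, 0] - reference[:, 0]
dy = mapped[:, 1] - reference[:, 1]

rmse_x = np.sqrt(np.mean(dx**2))
rmse_y = np.sqrt(np.mean(dy**2))
rmse_r = np.sqrt(rmse_x**2 + rmse_y**2)  # combined horizontal RMSE
print(f"RMSE_x = {rmse_x:.3f} m, RMSE_y = {rmse_y:.3f} m, RMSE_r = {rmse_r:.3f} m")
```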

An attribute is defined as a fact about a location, set of locations, or feature on the surface of the earth, while accuracy can be defined as the difference between a measurement (or attribute) and a comparable measurement known to be of a higher accuracy. The attribute accuracy portion for measures on a continuous scale must provide a numerical estimate of expected discrepancies, using procedures similar to those used for positional accuracies. The date of the tests and the date of the materials used must be included, along with the map scales of the respective sources. When different dates are involved, the report must describe the rates of change in the phenomenon as classified. Spatial variations in attribute accuracy must also be reported in a quality overlay. It should be noted that in some cases, one dataset may treat something as an attribute, while another may encode the same thing directly in the data. An example of this is contours: in one dataset the elevation is attached as an attribute to 2-D topology, while another represents the contours using X,Y,Z coordinates. Fundamental differences such as these make the notion of seamless data transfer a difficult one to implement in some cases.

Logical consistency describes how well relationships are encoded in the data structure of the digital spatial data. Any tests performed and their results must be detailed. Types of tests may include: tests of valid value; tests for graphic data (intersection, overshoots, tolerances, etc.); and topological tests (for area coverages entered or derived from chains). The dates of any tests must be provided, and if corrections were performed, a description must be given on how the new information was checked for logical consistency.

Completeness can be defined as an attribute that describes the relationship between objects represented in a data set and the abstract universe of all objects [8]. The completeness portion describes the selection criteria, definitions used, and other pertinent mapping rules (e.g., geometric thresholds such as minimum area or minimum width). Standard geocodes must be used if possible, and any deviations from standard definitions and interpretations must be described. This portion should also describe the relationship between the objects represented as well as the spatial and attribute properties of a set of features. The testing procedures used and their results must be described.

To summarize, SDTS describes data quality as having five components: lineage, positional accuracy, attribute accuracy, logical consistency, and completeness. Each component must be thoroughly defined, described, compared against a standard (when a relevant standard exists), and tested. It is envisioned that the same principles that enable data transfer will also enable data standardization, by thoroughly describing the product, the means by which it was created, and its accuracy.

3.2.2 Content Standard for Digital Geospatial Metadata

The Content Standard for Digital Geospatial Metadata (CSDGM) was developed so that prospective users could determine, from a central catalogue of geospatial datasets, which datasets may be appropriate to address a particular set of requirements. The CSDGM provides federal and other participating organizations the means for uniformly cataloguing geospatial datasets, in order that they can be referenced in the National Geospatial Data Clearinghouse. Geospatial datasets that follow this standard allow a user to determine the dataset availability, fitness for an intended use, and the means for accessing and transferring the data. The standard was developed through the oversight of the Federal Geographic Data Committee (FGDC), and its use by federal agencies was mandated in 1994 through Executive Order 12906, "Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure." [9]

The term "metadata" means "data about data," or, as more formally defined in the CSDGM, "data about the content, quality, condition, and other characteristics of the data" [10]. Metadata also establishes a uniform methodology for recording information about a set of data, thereby facilitating an environment for electronic searches. A parallel can be seen in an old-fashioned public library, in which many thousands of books and periodicals occupy a large amount of floor space. The volumes are referenced in a central card catalogue, which may list the ISBN number, author, title, subject, location (such as the name of the library branch or area within the same library), and other items of interest. In geospatial data, the books and periodicals correspond to the many thousands of geospatial datasets (e.g., the U.S. Fish and Wildlife Service's National Wetlands Inventory database). The card catalogue corresponds to metadata, which is centrally accessible (i.e., on the Internet) to a user, and tells the user what the dataset is comprised of, where it can be found, and other useful information. In the CSDGM, the main components of metadata are: [10]

• Identification information

• Data quality information

• Spatial reference information

• Entity and attribute information

• Distribution information, and

• Metadata reference information

It can be noted that some of the main components of CSDGM roughly correspond with the main components of SDTS, while others do not.

The part of greatest interest for our purposes is the section on data quality information. The data quality section specifies the use of six descriptors [10]: attribute accuracy, logical consistency, completeness, positional accuracy, lineage, and cloud cover. In contrast with the differences in the main components of the SDTS and CSDGM standards, here the first five descriptors correspond exactly to the five descriptors and their definitions in Section 3.1 of SDTS as described earlier, and the sixth descriptor, cloud cover, has been added. Within each of the six descriptors, very specific ways are prescribed to represent the information. Further details may be found in [10].

87 3.2.3 Content Standard for Digital Geospatial Metadata and Extensions for Remote Sensing Metadata

In 1998 the CSDGM was updated as Version 2 (FGDC-STD-001-1998), which described how geospatial data communities could develop profiles, that is, specific types of "mini-standards" of the base CSDGM standard. [11] A standard of interest to our efforts was drafted and made available for public review through August 31, 2001. This document is the Content Standard for Digital Geospatial Metadata: Extensions for Remote Sensing Metadata (Public Review Draft). It offers a set of standards to describe metadata for the subset of geospatial data that are obtained from remote sensing. Since SAR/InSAR data can be described as products derived from one type of remote sensing, this metadata standard is pertinent to the topic under study, and is summarized below.

The purpose of Extensions for Remote Sensing Metadata [12] is to provide a common terminology and set of definitions for documenting geospatial data obtained by remote sensing, within the framework of the FGDC’s (1998) Content Standard for

Digital Geospatial Metadata. The standard was developed by the Imagery Subgroup of the FGDC Standard Working Group, with support from government, industry, and the academic community. (The standard is maintained by the NASA Earth Science Data and

Information System Program for the FGDC.) The primary remote sensing metadata descriptors are: identification information, data quality information, spatial data organization information, spatial reference information, entity and attribute information, distribution information, metadata reference information, platform and mission information, and instrument information. The last two descriptors (platform and mission information, and instrument information) are new additions to the basic FGDC standard; the others were described earlier as part of the basic standard. However, all are modified to reflect the special characteristics of remote sensing data. Platform and mission information includes: mission description and history; platform sponsor(s), description, and orbit; and flight protocol. The instrument information includes: an instrument description (such as instrument type, hardware, orientation, properties, frame optics, calibration, and distortions); and instrument reference (detailed citation information).

The remote sensing metadata standard is essentially a highly detailed list, prescribing exactly what information is to be included in each of the nine categories of metadata descriptors, as well as the order in which it is to be entered. After a brief introduction, about 100 pages of (mostly) one-line, single-spaced descriptors follow, for which the dataset creator is expected to “fill in the blanks.” To give an idea of the level of detail involved, we look at a typical sub-level of description, optics. The following entries are requested under this one sub-level (a sketch of such a record follows the list):

• photographic resolving power (comprised of number of resolution values,

resolution value settings, the area weighted average resolution);

• resolution value set (comprised of resolving angle, resolving value radial, and

resolving value tangential);

• last calibration (comprised of date of last calibration, method of last

calibration, and institution of last calibration);

• relative aperture;

• exposure time;

• calibrated focal length;

• quality of the focal length.
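The nesting of these descriptors is easier to see in a structured form. The following is a minimal sketch of the optics sub-level as a set of record types; the field names paraphrase the descriptors listed above, while the grouping, types, and class names are our own illustration rather than the standard's encoding.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ResolutionValueSet:
    resolving_angle: float
    resolving_value_radial: float
    resolving_value_tangential: float

@dataclass
class LastCalibration:
    date_of_last_calibration: str
    method_of_last_calibration: str
    institution_of_last_calibration: str

@dataclass
class Optics:
    # photographic resolving power
    number_of_resolution_values: int
    resolution_value_settings: List[ResolutionValueSet]
    area_weighted_average_resolution: float
    # remaining single-valued descriptors
    last_calibration: Optional[LastCalibration]
    relative_aperture: float
    exposure_time: float
    calibrated_focal_length: float
    quality_of_focal_length: str
```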

Thus entry of a total of 13 parameters (including those in parentheses) is suggested for

one sub-level of a single description; and many require considerably more than this!

(Although not all descriptors are mandatory, many are, causing one to wonder whether

the level of required detail is too extensive to be practical—or has the level of complexity

escalated beyond our means to assess the influence of all of these factors?)

3.2.4 ISO/TC211 Draft Standard for Geographic Information/Geomatics

The International Organization of Standardization (ISO) is a worldwide federation

of national standards bodies (ISO member bodies). Its technical committees are involved

in preparing international standards. The ISO/TC211 Committee on Geographic

Information/Geomatics was founded in November 1994, and the U.S. has been involved

with the TC211 Committee via the ANSI/NCITS L1 Committee on Geographic Information.² In many respects, their work resembles that of SDTS, but is expanded in

some areas and simplified in others.

ISO/TC211 documents are being published under the title ISO 191XX Geographic Information, where XX corresponds to sub-documents 1 through 30. This is

a multi-part international standard that is intended to be used to build standards for

specific application domains, which will allow data interchange and interoperation between such domains [13]. At this time, many parts of the standard are still in draft form. The drafts of special interest are 19113 Quality Principles and 19124 Imagery and Gridded Data Components.³ Each is briefly discussed below in relation to our topic.

19113 Quality Principles [14] establishes the principles for describing the quality of geographic data and specifies components for reporting quality information, and it also provides an approach to organizing information about data quality. As with SDTS, the

international standard does not attempt to define a minimum acceptable level of quality

for geographic data. Quality is defined as “the totality of characteristics of a product that bear on its ability to satisfy stated and implied needs,” while accuracy is defined as “the closeness of agreement between a test result and the accepted reference value.” The document states that the quality of a dataset can only be assessed by knowing the data quality overview elements and the data quality elements. Data quality overview elements provide qualitative information on the purpose, lineage, and usage of the data set. The data quality elements and sub-elements are shown in Table 3.2.

1. Completeness
   - commission
   - omission

2. Logical Consistency
   - conceptual consistency
   - domain consistency
   - format consistency
   - topological consistency

3. Positional Accuracy
   - absolute or external accuracy
   - relative or internal accuracy
   - gridded data position accuracy

4. Temporal Accuracy
   - accuracy of a time measurement
   - temporal consistency
   - temporal validity

5. Thematic Accuracy
   - classification correctness
   - non-quantitative attribute correctness
   - quantitative attribute accuracy

Table 3.2. Data Quality Elements and Sub-Elements

For each sub-element, seven descriptors of data quality are prescribed (a sketch of one such record follows the list):

• scope - dataset series of which dataset is a part

• measure - name and description of data quality test to be used

• evaluation procedure - methodology used to perform quality evaluation

• result - outcome of the evaluation

• value type - such as “pass-fail” for a boolean test

• value unit - the unit used for the evaluation test, and

• date - of quality measure.
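To make these descriptors concrete, the sketch below records one data quality evaluation using the seven descriptors above as fields; the types and the example values are our own illustration, not part of the draft standard.

```python
from dataclasses import dataclass

@dataclass
class QualityDescriptor:
    scope: str                 # dataset series of which the dataset is a part
    measure: str               # name and description of the data quality test
    evaluation_procedure: str  # methodology used to perform the evaluation
    result: str                # outcome of the evaluation
    value_type: str            # e.g., "pass-fail" for a boolean test
    value_unit: str            # unit used for the evaluation test
    date: str                  # date of the quality measure

# Hypothetical entry for a completeness (omission) check on one layer.
example = QualityDescriptor(
    scope="thematic map series X, hydrography layer",
    measure="completeness: omission",
    evaluation_procedure="count features against a higher-accuracy reference",
    result="pass",
    value_type="pass-fail",
    value_unit="none (boolean)",
    date="2001-06-30",
)
```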

The results of the data quality evaluation are presented in a quality evaluation report.

Typically, a random sample of files is selected from various coverages to ensure cartographic and attribute completeness for all of the data quality elements and sub-elements, for each thematic layer. Alternatively, if part of the data quality evaluation is automated, then that part may be applied to the entire dataset.

19124 Imagery and Gridded Data Components [15] is another ISO/TC 211 draft standard, which presents consistent methods for reporting the quality of geographic information, and also discusses the need to report on the quality of evaluation itself. A standardized set of evaluation criteria and procedures is intended to allow the relative quality of one data set to be determined against another.

The draft is intended to standardize concepts to describe and represent imagery and gridded data, including rules for application schema, quality principles and evaluation procedures, spatial reference systems, visualization, and exploitation services.

Five component areas of imagery and gridded data were identified: (1) data model (or schema), (2) metadata, (3) encoding, (4) services, and (5) spatial registration. Each component is described in three levels of increasing detail. In the interest of brevity, only the first level of detail is shown in Table 3.3.

A quality evaluation component (quantitative or qualitative, as appropriate) should be formulated and applied to each of the following sources of error:

1. Acquisition

a. geometric aspects

b. sensor systems

c. platforms

d. ground control

e. scene considerations

2. Data processing

a. geometric rectification

b. radiometric rectification

c. data conversion

3. Data analysis

a. quantitative analysis

b. classification system

c. data generalization

4. Data conversion

a. raster to vector

b. vector to raster

Category              Level 1

Data Model/Schema     Data Model

Metadata              Platform Information; Sensor Information; Sensor Calibration;
                      Lineage; Geolocation/Geocoding; Product Attributes; Data
                      Dictionary; Encoding Description; Data Security Information;
                      On-line Documentation

Encoding              Encoding Rules; File Structure

Services              Statistics and Histogram Calculation; Image Mosaic; Coverage
                      Topology; Noise Removal; Systematic Radiometry Corrections;
                      Systematic Geometry Corrections; Subsetting and Subsampling;
                      Multi-Band Image Manipulation; Spatial and Frequency Filters;
                      Special Transformations

Spatial Registration  (none listed)

Table 3.3. Level 1 Components of Imagery and Gridded Data, ISO 19124 (draft)

However, further detail is not offered in the draft as to how the error assessment is to be

carried out. For additional information on ISO 19124, reference [15] may be consulted.

As with other geospatial standards, ISO 19113 and 19124 are important for outlining the various aspects of accuracy and error, which need to be considered as part of a thorough quality evaluation process.

3.3 Summary and Observations

The general measurement standards present a well-defined methodology for evaluating the quality and uncertainty of measurements in cases where quantitative measures can be used. These standards describe how to evaluate uncertainty using standard quantitative statistical treatments and mathematical models; how to provide adequate variation in the parameters to assess the “goodness” of a mathematical model; and the uncertainties that can be expected across a range of measurements. The standards recommend that the measurement and/or process be experimentally repeated on the same dataset, to determine how closely the results compare with one another (i.e., the repeatability). The standards also recommend that the main parameters of influence in the process be varied, in order to determine their relative influence and the degree to which the parameters can be controlled. The method for establishing reproducibility presents a measure of the consistency of results to within a stated range of uncertainty, across more than one laboratory.

On the other hand, the geospatial data standards present general concepts for describing data in the Spatial Data Transfer Standard (SDTS). Following these concepts and procedures, the Content Standards for Digital Geospatial Metadata (CSDGM) describes procedures for developing content standards for specific applications. The concepts of SDTS and the procedures of CSDGM are then adapted to Extensions for

Remote Sensing Metadata. Since SDTS/CSDGM concepts are embedded in the remote

sensing metadata extensions, we can rely on the use of the remote sensing metadata

standard for describing both the data as it is created and the interim processing steps. Recording the history of the procedures, models, and parameter settings to a high degree of detail should aid in repeating the process, thereby supporting conditions of repeatability. [It remains to be seen how much of the 100 pages of fields,

approximately 1 line per field, is necessary information to allow the process to be

repeated.] Although this metadata standard provides a framework for recording detailed

information about a dataset, it does not define how to evaluate the accuracy and quality of

the data.

The ISO 19113 draft document on Geographic Information—Quality Principles

describes the elements of quality that should be of concern in geospatial data:

completeness, logical consistency, positional accuracy, temporal accuracy, and thematic

accuracy. These elements will be considered for use in the new methodology, as

appropriate. Yet, this standard too falls short in not addressing how quality for these

elements can be determined.

ISO 19124 on Geographic Information — Imagery and Gridded Data Components

provides a more detailed description of the components of quality and accuracy that

should be evaluated. Each of the five elements of quality in ISO 19113 should be evaluated according to five components: the data model, metadata, encoding, services,

and spatial registration. The importance of evaluating the quality and accuracy of the evaluation procedure itself is also emphasized. These concepts will also be considered

for use in the new methodology (as appropriate). However, as with 19113, this standard

describes what elements should be evaluated for quality, but not how an evaluation is to

be conducted.

ISO/TC211 adds to our methodology by defining the process for evaluating the evaluation of the method used to assess data quality. While this is a useful process, once

again we are left without a methodology or benchmarks for evaluating the quality and

uncertainty of the data itself.

Perhaps the primary distinction between the two sets of standards— measurement

and geospatial—is the degree of rigor associated with the former. The measurement

standards to a large degree focus on the establishment of and comparison to absolute

standards, and experimental tests are then compared against these standards, using

measures of quality assurance and control. The geospatial standards to a large degree

focus on describing as fully as possible, the data as they currently exist. While for the

measurement standards “truth” is a comparison to an absolute standard that does not

change over time, for the geospatial standards “truth” can be viewed as relative in line

with fitness for use. Certainly, it is important to be able to describe fully any geospatial data as they now exist, but it may be even more important to relate those data to fixed standards. In Chapter 4 we will explore means by which it may be possible to develop fixed standards for geospatial data measurements (and hence for

SAR/InSAR thematic data, as one type of geospatial data), using the more generic measurement standards as a guideline.

3.4 Notes and References

Notes

1. Collaborative trial document furnished by Steve Ellison ([email protected]) to Paula Stevenson via e-mail attachment on May 29, 2001. Draft prepared by S.L.R. Ellison, Laboratory of the Government Chemist, United Kingdom, entitled “ISO TC69/SC6/WG7 — Statistical National Institute of Standards and Technology, U.S. Department of Commerce Technology Administration, May 2001, 31 pages.

2. Comments from Harold (Hal) Moellering, Professor of Geography and geospatial standards expert at The Ohio State University, in personal e-mail communication to Paula Stevenson from [email protected] on June 22, 2001.

3. Access to drafts was obtained through password authorization granted by Norman C. Anderson of Lockheed Martin, a leader of the U.S. portion of the TC211 effort, via a personal e-mail communication to Paula Stevenson from [email protected] on June 27, 2001. Documents were then obtained from password-protected URL http://www.statkart.no/isotc211/dokreg.htm.

References

[1] International Organization of Standardization, Guide to the Expression of Uncertainty in Measurement, a collaborative international effort by the International Bureau of Weights and Measures (BIPM), the International Electrotechnical Commission (IEC), the International Federation of Clinical Chemistry (IFCC), the International Organization of Standardization (ISO), the International Union of Pure and Applied Chemistry (IUPAC), the International Union of Pure and Applied Physics (IUPAP), and the International Organization of Legal Metrology (OIML), 1995, 101 pages.

[2] International Organization of Standardization, ISO 5725-2 Accuracy (Trueness and Precision) of Measurement Methods and Results—Part 2: Basic Method for the Determination of Repeatability and Reproducibility of a Standard Measurement Method, 1994, 42 pages.

[3] Taylor, Barry N., and Kuyatt, Chris E., NIST Technical Note 1297 — 1994 Edition: Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, National Institute of Standards and Technology, U.S. Department of Commerce Technology Administration, September 1994, 22 pages.

[4] Guptill, Stephen C. and Morrison, Joel L., Editors, Elements of Spatial Data Quality. Oxford, U.K.: Elsevier Science Ltd., published on behalf of the International Cartographic Association, 1995.

[5] U.S. Geological Survey, “What is SDTS?”, URL http://mcmcweb.er.usgs.gov/sdts/whatsdts.html, 3 pages.

[6] U.S. Geological Survey, “SDTS: Spatial Data Transfer Standard — Part 1,” URL http://mcmcweb.er.usgs.gov/sdts/SDTS_standard_nov97/part1b01.html, 3 pages, and http://mcmcweb.er.usgs.gov/sdts/SDTS_standard_nov97/part1b11.html, 3 pages.

[7] ASPRS Specifications and Standards Committee, “ASPRS Accuracy Standards for Large Scale Maps,” 1990, as summarized in the Federal Geographic Data Committee’s report # FGDC-STD-007.3-1998 on Geospatial Positional Accuracy Standards, Part 3: National Standard for Spatial Data Accuracy, Base Cartographic Data, found at URL http://fgdc.gov/standards/documents/standards/accuracy/chapter3.pdf, p. 1.

[8] Morrison, J., “The Proposed Standard for Digital Cartographic Data,” The American Cartographer, 15, 1988, pp. 129-135.

[9] Federal Geographic Data Committee, “Content Standard for Digital Geospatial Metadata (CSDGM),” http://www.fgdc.gov/metadata/contstan.html. 3 pages.

[10] Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata - FGDC-STD-001-1998, URL http://www.fgdc.gov/standards/documents/standards/metadata/v2_0698, accessed 6/30/01.

[11] Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata: Extensions for Remote Sensing Metadata, URL http://www.fgdc.gov/standards/status/csdgm_rs_ex.html, 2 pages.

[12] Standards Working Group, Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata: Extensions for Remote Sensing Metadata (Public Review Draft), December 21, 2000, pp. v-vi, 11, and lines 371-386, found by selecting this item at the URL referenced above in [9].

[13] International Organization of Standardization, “ISO/TC 211 Geographic Information/Geomatics Scope,” last updated April 2, 2001, URL http://www.statkart.no/isotc211/scope.htm. 29 pages.

[14] ISO/TC 211, Draft International Standards ISO/DIS 19113: Geographic Information — Quality Principles, February 22, 2001, URL http://www.statkart.no/isotc211 (restricted access), pp. 1-22.

[15] ISO/TC 211, Draft review summary from stage 0 of project 19124: Geographic Information — Imagery and Gridded Data Components, December 1, 2000, URL http://www.statkart.no/isotc211 (restricted access), 39 pages.

CHAPTER 4

THE NEW METHODOLOGY

Whenever decisions are based on analytical results, it is important to have some indication of the quality of the results; that is, the extent to which they can be relied upon for the purpose at hand. Confidence in data generated outside of one’s own organization can be achieved by introducing quality assurance measures to ensure that the data provider is capable of generating data of the required quality.

In order to establish confidence in results, it is essential that a measurement result be traceable to a defined standard such as an SI unit (from the International System of Units, such as the meter or gram), a reference material or, where applicable, a defined or empirical method. The quality of results should be demonstrated by giving a measure of the confidence that can be placed in the result and by determining its fitness for a given purpose. One useful measure of this is the measurement uncertainty. [1]

Although the concept of measurement uncertainty has been recognized for many years, it was the publication in 1993 of the Guide to the Expression of Uncertainty in

Measurement (GUM) that formally established general rules for evaluating and expressing uncertainty in measurement across a broad spectrum of measurements. In this chapter the concepts of the guide will be used as a template for developing a preliminary methodology for assessing uncertainty in SAR/InSAR thematic classification. In the following sections of this chapter, the scales of measurement and their

effect on available statistical methods will be discussed. The steps involved in the

evaluation of uncertainty will be summarized, and tools for carrying out the evaluation

will be introduced.

4.1 Determining the Measurand

For the purposes of this study we consider the measurand to be the way in which a

feature is represented by a given model, rather than the feature itself. The reason for this is that higher-accuracy sources used as “truth” are typically first converted into the

data model representation, then the two versions of the same model (higher accuracy and

lower accuracy) are compared. Thus, for the moment we assume that the model is an

adequate representation of reality, at least to first order (i.e., the main effects are

accounted for), and ignore any differences between reality and the model. The model is

assumed adequate when the difference between modeled results and experimental results

is within an acceptable range of error. The general model used to represent thematic

data is shown in Figure 4.1. The following steps are depicted in the figure: from the real

world, the model typically groups features into areas and labels the areas according to the class definitions, using grid cells as the basic unit. Results are compared with a source

that is more accurate than the measurement dataset. Grid cells that have a different value from the corresponding sampled "truth" dataset are identified, and results are reported in a truth table.

After the example, the concept of the scales of measurement will be examined; the scale of measurement plays a significant role in how error (i.e., uncertainty) is expressed statistically, especially in remote sensing classification efforts.

[Figure 4.1 consists of three panels. The first panel shows the real world with a grid superimposed. The second panel shows the representation (ideal) model of the world: each of the defined feature classes is a measurand, assigned a label L, where 1 = grass, 2 = roads and driveways, 3 = buildings, and 4 = trees. The classification rules are: (1) if a grid square has more than 50% of its area occupied by a feature, then label L is that feature type; and (2) all grid squares must be assigned one of these four numbers. We consider this as our “truth” dataset. The third panel shows the world as measured according to the model, that is, the representation model plus uncertainty. The same classification rules apply to both the “truth” (ideal) dataset and the measured dataset; however, due to measurement errors, four of the grid cells are classified incorrectly (shown in gray in the original). The errors occurred because the measurand cannot be known with total certainty. Note that the truth dataset is more detailed, and hence more accurate, than the measurement model.]

Figure 4.1. The real world, representation model, and measurement model.

4.2 Scales of Measurement

The scale of measurement that is used to evaluate measurement results is important, because the use of one scale over another can have significant implications for how the uncertainty of the measurement may be expressed. The fundamentals of the four scales of measurement (nominal, ordinal, interval, and ratio [2]) are discussed below.

A nominal scale is used when a set of features is distinguished (i.e., classified) on the basis of qualitative considerations only, without any implication of a quantitative relationship. Land use/thematic classifications are an example of differentiating land features on a nominal scale, such as forest, meadow, lake, etc.

An ordinal scale includes a nominal scale as a subset, but differentiates within a class of data on the basis of rank according to some quantitative measure. Variables are ordered from highest to lowest, for example, but numerical values are not defined.

Specific magnitudes or differences are not included. Examples are larger and smaller (as in the relative sizes of cities), and hotter or colder.

An interval scale adds the information of distance between ranks. For example, temperature can be differentiated using a standard scale, such as degrees (Celsius or

Fahrenheit), or elevation differences can be expressed in standard units of measure such as the meter.

A ratio scale, on the other hand, provides magnitudes that are intrinsically meaningful, by using an interval scale that begins at a zero point. Examples of ratio measures are elevation above a datum and calibrated SAR/InSAR sensor measurements.

4.3 Evolution of Nominal Statistics in Remote Sensing Thematic Maps

The evolution of the statistical treatment of remotely sensed data for land-use (or

thematic) maps is an interesting one. An early reference in the literature (Genderen,

1977) for assessing the accuracy of this type of data describes “a simple statistical

sampling procedure for determining the accuracy of remote-sensing-derived land-use

maps.” [3] The point is made that an interpreter needs to be able to specify the accuracy

of his or her product, but at the same time it is not practical to completely check every

data point. The article references earlier work in 1976 by Genderen and Lock [4] and

Hord and Brooner [5]; in 1974 by Zonneveld [6]; in 1971 by Rudd [7]; and in 1968 by

Stobbs [8]. Based on these earlier works, a technique of stratified random sampling is

assumed to be valid. The 1977 article then proposes:

"... a valid statistical sampling design which will test the correctness of the attribution...that is, for any sample point, it should be shown whether the remote sensing attribution to a class within the classification is correct or in error.” [3]

This was followed by the presentation of a matrix comparing the ground truth categories

with the interpreted land use categories. [It should be noted that the use of stratified

random sampling assumes a normal (Gaussian) distribution of classes across the

landscape, an assumption that has since been shown to be incorrect [9]. Although the

radiometry values assigned to the pixels may have a roughly Gaussian distribution, once

the pixels are classified the distribution is no longer Gaussian. Thus, even the method of

sampling can be called into question.]

Examples of later work in 1979 by Ginevan [10], in 1981 by Fitzpatrick-Lins

[11], in 1982 and 1985 by Aronoff [12,13,14], and in 1986 by Rosenfield and Fitzpatrick-

Lins [15], admit shortcomings in the statistical procedures proposed in earlier papers.

The authors attempt to improve the statistical treatment by expanding the truth table

concept through the addition of acceptance sampling, improved random sampling using

computers, errors of commission and omission, map accuracy reports, the minimum

accuracy value, and Kappa and conditional Kappa coefficients. New users have since

been taught these and similar methods, which continue as standard practice to the present day for evaluating the accuracy of all types of classified, remotely-sensed imagery.

4.4 Comparison of Categorical (Nominal) and Interval Statistics

It should be noted that remotely sensed data are initially measured at interval or ratio scales. For example, the signal return from SAR/InSAR is simply an analog electromagnetic wave. Although the signal is digitized for processing purposes, most of the original information in the signal is retained. However, when the data are finally converted into the far simpler categorical (feature class) data, these are the data evaluated for accuracy. Yet in this conversion, the richness of the data is sharply reduced to a simple binary yes/no test: is a certain feature category to be assigned to a particular cell or not? A feature is assigned to one particular class or to another; but there are no measures of how close something is to a true reference value.

It is important to recognize that statistical analyses are extremely limited for categorical (i.e., nominal) data. Only the frequency or proportion of responses in each category can be calculated using methods such as a truth table. As discussed earlier, a truth table is insufficient for evaluating accuracy when used as the sole source. In contrast, a wide variety of powerful and descriptive statistical analysis tools becomes available, when data can be analyzed for variables measured at interval or ratio scales.

These include measures such as mean, standard deviation, variance, correlation, t tests, and F tests. Viewed in this perspective, then, the obvious question emerges: why are

statistical tests only performed on the final qualitative categorical data, and not on the

quantitative measurement data that are provided by the sensor and largely retained

through subsequent processing?

4.5 Illustration of Quantitative vs. Qualitative Statistics

The difference in available statistical measures can be illustrated with the following example. In this example, in order to simplify the analysis we assume that the

“truth” dataset (as represented by the model) has no error, and so functions as the measurand. However, in real life even the “truth” (reference) dataset will have an associated measure of uncertainty. For simplicity in the following analysis, the ideal

“truth” dataset in Figure 4.2b has already been downsampled to the same resolution as the measurement dataset in Figure 4.3b for purposes of comparison.

Figure 4.2a shows a hilltop area. The true elevations of the ground (without trees) are represented in the 5 x 5 gridded Digital Elevation Model (DEM) α shown in Figure 4.2b. The elevations in each grid cell are given relative to a zero of the average mean sea level (AMSL). The cell values in the DEM α in Figure 4.2b are therefore our measurands, designated as such because they represent the true elevations according to the DEM representational framework. The “true” values vary from a minimum elevation of 410 meters at the upper right corner of the grid, to a maximum elevation of 680 meters just above the center. We note that the measurements in α are subject to evaluation according to quantitative statistics.

[Figure 4.2a (Real World) and Figure 4.3a (Reconstructed World with Error) are pictorial panels showing the hilltop area.]

DEM α (Figure 4.2b)          DEM α′ (Figure 4.3b)

480  530  580  530  410      470  490  570  530  400
490  570  680  580  450      450  560  690  600  500
590  650  675  550  430      490  590  680  600  510
470  580  590  560  480      450  490  580  550  450
420  440  460  440  430      410  430  470  440  430

Figure 4.2b. Ideal model α of world as DEM (measurand). Figure 4.3b. DEM α′ as measured from Figure 4.3a (measurand + error).

Class Assignments

Class    Values
A        400-499 m
B        500-599 m
C        600-699 m

β (Figure 4.2c)          β′ (Figure 4.3c)

A  B  B  B  A            A  A  B  B  A
A  B  C  B  A            A  B  C  C  B
B  C  C  B  A            A  B  C  C  B
A  B  B  B  A            A  A  B  B  A
A  A  A  A  A            A  A  A  A  A

Figure 4.2c. Model β of truth for the DEM, assigned to 3 classes (measurand). Figure 4.3c. Measurements β′, classified from the values in Figure 4.3b into the same 3 classes (measurand + error).

Moving on to Figure 4.2c, the class assignment chart is used to assign the gridded elevations to class A, B, or C. This is now a different representational model, designated as β. Although the measurements are in a different model, they are still “true” relative to the hill of Figure 4.2a. Hence the classes as labeled in Figure 4.2c can be considered as measurands, although they are different measurands from those of the DEM α. Since the values of β are now classes, they are subject to qualitative statistical treatment.
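The assignment from α to β (and, identically, from α′ to β′) can be written as a short routine. The following sketch uses the class assignment chart and the grid of Figure 4.2b exactly as given above.

```python
def classify(elev_m):
    """Assign class A (400-499 m), B (500-599 m), or C (600-699 m)."""
    if 400 <= elev_m < 500:
        return "A"
    if 500 <= elev_m < 600:
        return "B"
    if 600 <= elev_m < 700:
        return "C"
    raise ValueError(f"{elev_m} m falls outside the defined classes")

# DEM alpha (Figure 4.2b); applying the same function to alpha'
# (Figure 4.3b) yields beta' (Figure 4.3c).
alpha = [
    [480, 530, 580, 530, 410],
    [490, 570, 680, 580, 450],
    [590, 650, 675, 550, 430],
    [470, 580, 590, 560, 480],
    [420, 440, 460, 440, 430],
]
beta = [[classify(v) for v in row] for row in alpha]  # matches Figure 4.2c
```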

Next, an instrument is used to measure the real-world elevations of Figure 4.2a, and values are recorded in the 5 x 5 DEM matrix α′ shown in Figure 4.3b. The instrument records the measurand + error. Then the DEM values of α′ in Figure 4.3b are

assigned to class A, B, or C, again using the class assignment table. The 3 classes (of measurand + error) are shown in Figure 4.3c as β′. As before, α′ is subject to quantitative statistics and β′ to qualitative statistics.

Next, our task is to associate a degree of confidence with the measurement error in comparing models α and α′ (the measurand and the measurement), and β and β′ (a different measurand and the determined classes). We begin with determining the error in the first model. From the large set of available quantitative statistical tools, we decide to use the average error, the standard deviation, and a scatter plot. These are then computed (for this simplified case) using matched pair differences [16], as follows:

The mean of the matched pair differences d̄ is:

    d̄ = (Σ d_i)/N = X̄ − Ȳ

where d_i = difference for matched pair i = X_i − Y_i, and N = number of matched pairs.

Evaluating α′ with respect to α, the average error is found to be:

    d̄ = 9.4 m

(Note that the error of 9.4 m is relatively small compared to the range of values, since this particular statistical measure allows positive and negative pairwise differences to balance each other out to some degree. In fact, the pairwise differences range from -80 meters to +100 meters, as shown in Table 4.1.)

The standard deviation of the population is given by [17]:

    σ_d = sqrt[ Σ (d_i − d̄)² / N ]

    σ_d = 38 m

By the definition of standard deviation, this means that about 2/3 of the measured points fall within 38 m of d̄. The scatterplot shown in Figure 4.4, which plots the measurement number (1 to 25) against the differences d_i of Table 4.1, verifies the error distribution. Many other statistics, including spatial statistics, are available to further evaluate the results.

Now we compare β′ against the measurand β, using qualitative statistics following the standard truth table method shown in Figure 4.5. The average overall accuracy by area is obtained by adding the diagonal elements (10, 5, 2) together. The sum of 17 is then divided by the total number of cells (25), for an average accuracy of 68%. Thus, we know that 68% of the points sampled (all points were used, in this case) match the “true” measurands in model β, but we don’t know how closely they match, or conversely, by how far the unmatched points differed.

We can also note that, whereas in this example all of the cells in β′ were compared with the “truth” of β, in typical remote sensing studies only a small portion of the points (i.e., cells) are sampled, which can introduce significant additional uncertainty. Yet tests to compare the uncertainty of a “sampled” set of points to the uncertainty of an entire scene are not found in the literature. Therefore, the uncertainty introduced by the sampling process is unknown.

X_i in Fig. 4.2b (L to R)   Y_i in Fig. 4.3b (L to R)   d_i (difference)

         480                        470                       10
         530                        490                       40
         580                        570                       10
         530                        530                        0
         410                        400                       10
         490                        450                       40
         570                        560                       10
         680                        690                      -10
         580                        600                      -20
         450                        500                      -50
         590                        490                      100
         650                        590                       60
         675                        680                       -5
         550                        600                      -50
         430                        510                      -80
         470                        450                       20
         580                        490                       90
         590                        580                       10
         560                        550                       10
         480                        450                       30
         420                        410                       10
         440                        430                       10
         460                        470                      -10
         440                        440                        0
         430                        430                        0

Table 4.1. True DEM values X_i, measured values Y_i, and differences d_i
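The two statistics above are straightforward to reproduce from Table 4.1, as the following sketch shows (the population form of the standard deviation, dividing by N, is used to match the formula above).

```python
import math

# Differences d_i = X_i - Y_i, read from Table 4.1 top to bottom.
d = [10, 40, 10, 0, 10, 40, 10, -10, -20, -50,
     100, 60, -5, -50, -80, 20, 90, 10, 10, 30,
     10, 10, -10, 0, 0]

N = len(d)
d_bar = sum(d) / N                                           # 9.4 m
sigma_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / N)  # ~38 m
```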

Figure 4.4. Scatterplot of matched pair differences between “truth” grid cells and “measured” grid cells in Figures 4.2b and 4.3b.

                    Truth
                A     B     C    Total
Measured   A   10     3     0      13
           B    2     5     1       8
           C    0     2     2       4
Total          12    10     3      25

Average overall accuracy by area: 68%

Figure 4.5. Error Matrix/Truth Table for Comparing β and β′ (Figures 4.2c and 4.3c)
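The error matrix and the 68% overall accuracy can likewise be recomputed directly from the class grids of Figures 4.2c and 4.3c, as in this sketch.

```python
from collections import Counter

truth    = ["ABBBA", "ABCBA", "BCCBA", "ABBBA", "AAAAA"]  # beta,  Figure 4.2c
measured = ["AABBA", "ABCCB", "ABCCB", "AABBA", "AAAAA"]  # beta', Figure 4.3c

# Tally (measured label, truth label) pairs cell by cell.
matrix = Counter()
for t_row, m_row in zip(truth, measured):
    for t, m in zip(t_row, m_row):
        matrix[m, t] += 1

diagonal = sum(matrix[c, c] for c in "ABC")   # 10 + 5 + 2 = 17
accuracy = diagonal / sum(matrix.values())    # 17 / 25 = 0.68
```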

In the case of the DEM representation model, only a single scalar quantity for elevation was permitted in each grid cell. In contrast, in the SAR/InSAR representation model (prior to classification) each pixel (i.e., grid cell) can carry magnitude, phase, and polarization, often for two or more frequencies and polarizations. Yet we see that SAR/InSAR data—a richer data source than the DEM of our example—are reduced to a few classes in much the same way as the reduction from model α′ to β′. However, if instead the data were evaluated before classification (and after, if desired), quantitative methods could be used to obtain the uncertainty.

4.6 Other Sources of Uncertainty

Another component of uncertainty that was not discussed above is that any model has error associated with the model itself, since no model can represent the world with complete accuracy. In the case of the DEM α, the simple act of gridding the terrain and allowing only one value for each grid cell introduces error. This is because the terrain elevation can vary considerably within a single grid cell, but the model only allows a single value, resulting in a loss of information and detail about the terrain. Then, in the further simplification of assigning three classes to the grid in the β model, we introduce the possibility of further error and misrepresentation, since if one cell valued at (say) 499 is labeled as “A”, a neighboring cell valued at 500 (only 1 meter higher) must be labeled as “B”. It therefore appears (due to the classification rules) that the two cells are of considerably different elevations when that is not the case.

In engineering and science, a standard methodology is often used to develop models and assess their uncertainty. Typically, based on experimental data, a mathematical model is developed. The model parameters are varied in the mathematical model and results are recorded. The modeled results are then compared with experimental data in which the parameters have been varied a corresponding amount. The degree of agreement between the model and the experimental data tells us how good the model is, if the parameters have been varied to a sufficient degree. If the model and data do not agree to within a desired range, then the model is unsuitable and a new or modified model is sought to better align with the experimental data.

By contrast, in the assessment of accuracy of remote sensing classifications, these types of comparisons have not been made because only the final, categorical data have been evaluated, using qualitative statistics. However, the use of quantitative models, measurements, and statistics offers a framework that can follow the standard methodology in science and engineering. For this reason, in the next section we will examine ways to evaluate uncertainty using quantitative measures. The measures will be applied to the steps in SAR/InSAR data processing that lead up to the final classification and corresponding conversion into qualitative data.

Labeling decisions in remote sensing are made largely on the basis of continuous numerical values (associated primarily with spectral and spatial characteristics of the data), yet the final output is categorical (nominal). So, while our final data are categorical, the decisions on how to assign a given class are typically driven by mathematical models.

In this study we will assign the categorical label L to a set of functions [18] composed of four major component models that are primarily mathematical, as follows:

L = f(sensor/scene physical interactions and models)
    + f(antenna and processor models and algorithms)
    + f(classification models and algorithms)
    + f(labeling models and map output requirements)

Each of the four major component models in turn is composed of a set of sub-component models. The fact that most of the models that comprise this process are

mathematical implies that we can quantitatively evaluate the individual and combined

uncertainties of the components and sub-components, before the final categorization

(labeling) process is carried out. Therefore, the uncertainty of the measurand can be

expressed in units that are relevant to its component (and sub-component) functions.

4.7 The New Methodology

In Chapter 3 the concepts of general measurement standards and geospatial data

standards were discussed. Here we primarily apply the general measurement standards,

with some supplementation by the geospatial data standards, and present a new, seven-

step methodology, described below and summarized in Table 4.2.

In Step 1, the measurand is defined and described as completely as possible.

(Recall that for our purposes, the measurand is typically the representation of the “true” feature as provided by a particular model, not the feature itself.) Interim measurands may also be needed for complex procedures.

In Step 2, the mathematical model for a given process is expressed as the relationship between the measurand Y and input quantities X_i upon which Y depends.

Step 1. Define and describe the measurand as completely as possible.

Step 2. State the mathematical model for a given process as Y = f(X_1, X_2, ...).

Step 3. Determine the estimated values x_i of the input quantities X_i.

Step 4. Evaluate the standard uncertainty for each input estimate x_i.

Step 5. Determine the combined standard uncertainty μ_c.

Step 6. Fully document all steps and processing parameters and report results.

Step 7. Evaluate repeatability and reproducibility of the results.

Table 4.2. Summary of New Methodology

In Step 3 the estimated range of values x_i for the input quantities X_i is determined. Those factors that can be fixed (controlled) within the procedure should be indicated, along with those that cannot.

In Step 4 the standard uncertainty μ(x_i) is evaluated for each input estimate x_i, according to the procedure that was described in Chapter 3. For those parameters that cannot be fixed, the range of uncertainty for the parameters should be varied, experiments conducted, and the resulting uncertainties determined. When it is difficult to carry out experiments, results may be modeled mathematically. The standard uncertainty is evaluated by calculating the experimental standard deviation and the experimental standard deviation of the mean.
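When repeated observations of an input estimate are available, Step 4 amounts to computing those two quantities directly. A minimal sketch, using hypothetical observations:

```python
import statistics

def type_a_uncertainty(observations):
    """Experimental standard deviation s and experimental standard
    deviation of the mean, s / sqrt(n)."""
    n = len(observations)
    s = statistics.stdev(observations)  # n - 1 in the denominator
    return s, s / n ** 0.5

s, s_mean = type_a_uncertainty([10.1, 9.8, 10.3, 10.0, 9.9])
```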

In Step 5 the combined standard uncertainty μ_c is determined. The combined standard uncertainty can be found through sensitivity analysis, by evaluating the partial derivatives for each estimate as shown [19]:

    μ_c²(y) = Σ_i (∂f/∂x_i)² μ²(x_i)                     for independent input quantities X_i, and

    μ_c²(y) = Σ_i Σ_j (∂f/∂x_i)(∂f/∂x_j) μ(x_i, x_j)     for correlated input quantities.
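For independent inputs, the first expression can be evaluated without hand-derived partials by approximating each ∂f/∂x_i with a central finite difference, as in the following sketch (the model f and the values shown are placeholders).

```python
def combined_uncertainty(f, x, u, rel_step=1e-6):
    """mu_c(y) = sqrt( sum_i (df/dx_i)^2 mu^2(x_i) ) for independent inputs,
    with the partial derivatives estimated numerically."""
    total = 0.0
    for i in range(len(x)):
        h = abs(x[i]) * rel_step or rel_step   # guard against x_i == 0
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        dfdx = (f(xp) - f(xm)) / (2 * h)
        total += (dfdx * u[i]) ** 2
    return total ** 0.5

# Placeholder model y = x1 * x2**2 with estimates and their uncertainties:
uc = combined_uncertainty(lambda v: v[0] * v[1] ** 2, [2.0, 3.0], [0.1, 0.05])
```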

In Step 6 all steps and processing parameters are thoroughly documented according to the process described in Extensions for Remote Sensing Metadata in Chapter

3, and all results are reported.

In Step 7 the repeatability and reproducibility of the process are determined by following the prescribed guidelines and calculating the reproducibility variance s_R².
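As a sketch of the Step 7 calculation under the ISO 5725-2 basic method for a balanced experiment (p laboratories, n replicates each): the repeatability variance s_r² is the pooled within-laboratory variance, the between-laboratory variance s_L² is estimated from the variance of the laboratory means, and the reproducibility variance is s_R² = s_r² + s_L². The data below are hypothetical.

```python
import statistics

def repeatability_reproducibility(labs):
    """Balanced ISO 5725-2 style estimates from replicate results per lab."""
    n = len(labs[0])                                   # replicates per lab
    s_r2 = statistics.mean(statistics.variance(lab) for lab in labs)
    s_d2 = statistics.variance([statistics.mean(lab) for lab in labs])
    s_L2 = max(s_d2 - s_r2 / n, 0.0)                   # clamp at zero
    return s_r2, s_L2, s_r2 + s_L2                     # s_r^2, s_L^2, s_R^2

s_r2, s_L2, s_R2 = repeatability_reproducibility([
    [9.9, 10.1, 10.0, 10.2],   # laboratory 1
    [10.4, 10.6, 10.5, 10.3],  # laboratory 2
    [9.7, 9.9, 9.8, 10.0],     # laboratory 3
])
```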

Similar to the way in which remote sensing metadata extensions have been built on SDTS and CSDGM, so other fields of science have been building on the principles embodied in GUM and other general measurement standards in order to apply the general principles to their own disciplines. One such expanded measurement standard is described below.

4.8 Adapting Uncertainty Measures in Chemical Analysis to Thematic Mapping

In January 2001 a new standard was released that applies the principles of GUM to chemical measurements [20]. This is a particularly apt standard for our purposes, because there are many parallels between measuring and classifying non-uniform materials in chemistry and non-uniform landscapes in thematic mapping, as well as assessing the associated uncertainty. Possible sources of uncertainty shared by both chemical analysis and landscape mapping are as follows [21]:

Sampling: The effect of homogeneity or nonhomogeneity on the measurement

result needs to be determined, along with the effects introduced by a specific sampling

strategy and the physical state of the sample (i.e., either the chemical sample or the

landscape sample), and temperature and atmospheric effects.

Certified Reference Materials (CRMs): in order to establish the fidelity of

measuring systems, reference materials are needed against which equipment (and/or

algorithms?) can be calibrated. CRMs are available for chemical analysis purposes: can

CRMs be developed in remote sensing, using aids such as calibration ranges, data/image

samples, etc.?

Calibration of Instrument: Instruments (and/or algorithms?) should be calibrated

using a certified reference material, in order to establish common, uniform bases for

measurements. The uncertainty of the reference material also needs to be considered,

along with the precision of the instrument.

Analysis: Consider the uncertainty introduced by operator effects and other

systematic errors, instrument/parameter settings, and run-to-run (i.e., image to image)

measurement precision.

Data Processing: Consider the effects of averaging, rounding/truncating figures,

statistics, and processing algorithms including model fitting.

Presentation of Results: In both chemical analysis and thematic mapping, a final estimate of uncertainty should be provided, and in some cases, the confidence level.

The chemical guide also offers practical methods for implementing measurement uncertainty concepts. The aspects of the analysis must be described: the goal, the measurement procedure, the measurand, identification of uncertainty sources,

quantification of uncertainty components, the combined standard uncertainty, and a chart

depicting the uncertainty contributions due to different factors for a given process [22].

4.9 Discussion of Calibration Standards and Traceability

The use of calibration standards is part of nearly every determination in chemical

analysis, because modern routine analytical measurements are relative measurements,

which need a reference standard to provide traceability to the SI. The chemical guide

presents a detailed example on how to prepare a calibration standard [24]. By observing

how a calibration standard is prepared for a chemical sample, we may be able to adapt

procedures useful for developing feature or other calibration standards to support the

current topic of study.

The first step in developing a calibration standard is to write down a statement of

what is being measured. The specification includes a description of how the calibration

standard is prepared, as well as the mathematical relationship between the measurand and

the parameters it depends on. The preparation normally follows a Standard Operating

Procedure (SOP). Any contamination must be removed from the reference material,

which should be homogenous and highly purified, following a standard method so that

the sample complies strictly with the specified criteria for the sample. Then the sample is

measured and its concentration determined.

The second step is to identify and analyze uncertainty sources, for each of the

parameters that affect the value of the measurand. The different effects and their

influences can be shown in a cause and effect diagram.

The third step in preparing a calibration standard is to determine the size of each identified potential source of uncertainty, which is either measured directly, estimated using previous experimental results, or obtained from theoretical analysis. A repeatability experiment should be performed to duplicate the measurements, to evaluate uncertainties due to variation in the measurement process, and to compare results with a calibration standard. Finally, the fourth step is to calculate the combined standard uncertainty and graph the results.

The concept of varying parameters and observing the effect on the resulting measurement has already been attempted in many SAR/InSAR studies, for certain types of features. It brings up the concept of developing calibration image “clips”, or small samples of images that can be used to compare the results of classification. Some effort has been given to the development of “keys” for standard remote sensing studies [25]; however, a theoretical basis has not been developed for the selection of the keys (such as what defines Crop A, Crop B, etc.). Yet these efforts may offer insight into the preparation of calibration standards in our field of study.

In Chapters 5, 6, 7, and 8, this methodology will be applied to the four related

SAR/InSAR processes, respectively: sensor/scene interactions; antenna and processor functions; image classification models and algorithms; and map output requirements. A preliminary study should quickly identify the most significant sources of uncertainty, since the value obtained for the combined uncertainty is almost entirely controlled by the major contributions. For this reason, a good estimate of uncertainty should be possible by concentrating effort on the largest contributions.

Then, in Chapter 9 the total uncertainties will be combined using a theoretical model. Our emphasis in Chapters 5 through 9 will be evaluating the potential range of uncertainties due to the dominant factors.

4.10 References

[1] See end of Chapter 3, note 1, p. 1.

[2] Robinson, Arthur H., Sale, Randall D., Morrison, Joel L., and Muehrcke, Phillip C., Elements of Cartography. Fifth Edition. New York: John Wiley and Sons, 1984, pp. 109-110.

[3] Genderen, J. L. van, “Testing Land-Use Map Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 43, No. 9, September 1977, pp. 1135-1137.

[4] Genderen, J. L. van, and Lock, B. P., “A Methodology for Producing Small Scale Rural Land Use Maps in Semi-Arid Developing Countries Using Orbital M.S.S. Imagery,” Final Contractor’s Report, NASA CR-151173, September 1976, 270 pages.

[5] Hord, R. M., and Brooner, W., “Land-Use Map Accuracy Criteria,” Photogrammetric Engineering and Remote Sensing, Vol. 42, No. 5, May 1976, pp. 671-677.

[6] Zonneveld, I. S., “Aerial Photography, Remote Sensing and Ecology,” ITC Journal, Part 4 , 1974, pp. 553-560.

[7] Rudd, R. D., “Macro Land-Use Mapping with Simulated Space Photographs,” Photogrammetric Engineering, Vol. 37, 1971, pp. 365-372.

[8] Stobbs, A. R., “Some Problems of Measuring Land Use in Underdeveloped Countries: the Land Use Survey of Malawi,” Cartographic Journal, Vol. 5, 1968, pp. 107-110.

[9] Atkinson, Peter M., “Spatial Statistics,” from Spatial Statistics for Remote Sensing, edited by Alfred Stein, et al., Dordrecht: Kluwer Academic Publishers, 1999, pp. 60-61.

[10] Ginevan, Michael E., “Testing Land-Use Map Accuracy: Another Look,” Photogrammetric Engineering and Remote Sensing, Vol. 45, No. 10, October 1979, pp. 1371-1377.

[11] Fitzpatrick-Lins, Katherine, “Comparison of Sampling Procedures and Data Analysis for a Land-Use and Land-Cover Map,” Photogrammetric Engineering and Remote Sensing, Vol. 47, No. 3, March 1981, pp. 343-351.

[12] Aronoff, Stan, “Classification Accuracy: A User Approach,” Photogrammetric Engineering and Remote Sensing, Vol. 48, No. 8, August 1982, pp. 1299-1307.

[13] Aronoff, Stan, “The Map Accuracy Report: A User’s View,” Photogrammetric Engineering and Remote Sensing, Vol. 48, No. 8, August 1982, pp. 1309-1312.

[14] Aronoff, Stan, “The Minimum Accuracy Value as an Index of Classification Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 51, No. 1, January 1985, pp. 99-111.

[15] Rosenfield, George H., and Fitzpatrick-Lins, Katherine, “A Coefficient of Agreement as a Measure of Thematic Classification Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 52, No. 2, February 1986, pp. 223-227.

[16] McGrew, J. Chapman, Jr., and Monroe, Charles B., An Introduction to Statistical Problem Solving in Geography. Dubuque, Iowa: William C. Brown Publishers, 1993, p. 167.

[17] Ibid. p. 46.

[18] Diamond, William J. Practical Experiment Designs for Engineers and Scientists, Belmont, CA: Lifetime Learning Publications, 1981, pp. 7-8.

[19] International Organization for Standardization, Guide to the Expression of Uncertainty in Measurement. First Edition, Switzerland: ISO, 1995, pp. 19-22.

[20] Eurachem, Guide Quantifying Uncertainty in Analytical Measurement, St. Gallen: EMPA, 19 November 2000.

[21] Ibid. Appendix C.

[22] Ibid. Appendix A1.

[23] Ibid. Appendix D.

[24] Ibid. Appendix A1.

[25] Kelly, Melissa, Estes, John E., and Knight, Kevin A., “Image Interpretation Keys for Validation of Global Land-Cover Data Sets,” Photogrammetric Engineering and Remote Sensing, Vol. 65, No. 9, September 1999, pp. 1041-1050.

CHAPTER 5

UNCERTAINTY IN SENSOR-SCENE INTERACTIONS

This chapter will focus on fundamental interactions between the signal of a

SAR/InSAR sensor and the terrain being imaged. The part of the sensor that will be

emphasized in this chapter is the antenna(s), used to both transmit and receive the

microwave signal. Other portions of the sensor and processors will be discussed in

Chapter 6.

Much has been written about specific sensor-scene interactions, such as the

development of models and measurements for specific terrains consisting of forests,

cropland, snow and ice, and others. However, in our case we will primarily discuss only

the most basic physical interactions, to the first order, along with their corresponding

uncertainties. By studying the uncertainties in the most basic interactions, the

uncertainties in more complex interactions (such as vegetation modeling, which uses the basics as building blocks) may be better understood and inferred to a significant degree.

The 7-step methodology outlined in Chapter 4 for analyzing the uncertainty will be followed to the extent made possible by available sources. The main focus will be a study of uncertainty of the point radar equation as presented in Chapter 1. To this will be added an in-depth study of the principles of backscatter—the most dominant factor of uncertainty in the sensor-scene interaction—and its main contributors of surface

roughness, local incidence angle, and dielectric properties. The uncertainties associated with elevation determination in InSAR will also be discussed. Other dominant factors of the signal-scene interaction will be mentioned in order to gain insight on the benefits, limitations, and uncertainties associated with the technology, including polarimetry, coherency, volume scattering and Bragg scattering. Table 5.1 summarizes the factors to be analyzed for uncertainty in this chapter according to the methodology. In many cases a model is given and uncertainties are calculated; however, in some cases only partial information could be located, and in a few cases none, as shown by the blank squares.

5.1 Analysis of Uncertainty of the Radar Point Equation

To review from Chapter 1, the point format of the radar equation can be expressed as follows, in terms of the geometry of the transmitted and received microwave energy and the radar cross section [1]:

P_R = P_T σ G² λ² / [(4π)³ R⁴], where:

P_R = total power received from a point source

P_T = transmitted power from the antenna

σ = radar cross section: a measure of the reflectivity from a discrete point scatterer

G = gain: the increase in power of the transmitted signal provided by the antenna

λ = wavelength of the signal

R = slant range: distance as measured from the antenna to the image point on the ground

1/[(4π)³ R⁴] = scaling factor for the fraction of power returned to the antenna from the wave front

Chapter 5: Overview of Uncertainty Analysis

                               Step 1     Step 2   Step 3   Step 4   Step 5   Step 6    Step 7
                               Define     Math.    Inputs   Find     Find     Report    Repeatab.
                               measurand  model    x_i      μ(x_i)   μ_c      results   & Reprod.

1. Radar point equation           ●         ●        ●        ●        ●
2. Radar cross section:
   a. speckle                     ●         ●        ○
   b. surface roughness           ●         ●        ●        ●        ●
   c. local incidence angle       ●         ●        ●        ●        ●
   d. dielectric properties       ●         ●        ◐        ◐        ○
3. Polarimetry                    ●         ●        ◐        ◐        ○
4. InSAR                          ●         ●        ●        ●        ○
5. Correlation/coherence          ●         ●        ◐        ◐        ○

● Quantity given     ◐ Quantity partly given     ○ Numerical value only

Table 5.1. Overview of Uncertainty Analysis in Sensor-Scene Interactions. Not all quantities were able to be located. The degree of completion is indicated by the circles on the chart. Columns 6 and 7 would be completed in a testing environment.

This version of the radar equation is limited to monostatic radar, in which the same antenna is used to both transmit and receive the signal.

The first step in the methodology for analyzing the uncertainty is to define the measurand, which is a single scalar quantity. In this case our measurand is P_R as defined above. The next step is to state the mathematical model in the form of Y = f(X_1, X_2, ...), as was done in the radar equation. Y is the measurand P_R, and f is the function on the right side of the equation. The X_i's are P_T, σ, G, λ, and R.

The third step is to determine the estimated values x_i of the input quantities X_i. The values of x_i will be used in the fourth step of determining the standard uncertainty. Note that some quantities are given in decibels (value in dB = 10 log₁₀(x_i)). A conversion chart is shown in Figure 5.1.

[Figure 5.1 is a chart for converting decibels to the linear ratio x = (power received)/(power transmitted).]

Figure 5.1. Decibel Conversion Chart

According to Zebker et al., the following represent “typical” quantities found in

SAR aircraft measurements [adapted from 2]:

P_T = 30 dBW (decibel-watts)

σ = -13.40 dB·m²

G = 30 dB

λ = 0.05 m

R = 10,000 m
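As a numerical sketch, the decibel quantities above can be converted to linear units (x = 10^(dB/10)) and substituted into the radar point equation; the resulting received power illustrates how little of the transmitted power returns to the antenna. The values follow the list above, with σ read as -13.40 dB relative to 1 m².

```python
import math

def from_db(value_db):
    """Convert a decibel value to a linear ratio (to watts for dBW)."""
    return 10 ** (value_db / 10)

P_T = from_db(30)        # 30 dBW -> 1000 W
sigma = from_db(-13.40)  # radar cross section, ~0.046 m^2
G = from_db(30)          # 30 dB  -> 1000 (dimensionless gain)
lam = 0.05               # wavelength, m
R = 10_000.0             # slant range, m

P_R = P_T * sigma * G ** 2 * lam ** 2 / ((4 * math.pi) ** 3 * R ** 4)
# P_R is on the order of 1e-15 W, many orders of magnitude below P_T.
```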

Since we do not have an actual set of experimental and repetitive measurements, we will develop only the theoretical uncertainties μ(x_i) in Step 4, using the given values above. Calculating the partial derivatives of the parameters in the radar equation, we obtain the following.

The uncertainty in the estimate of P r for a given error in P t is:

U pr = G^CTA^/[(4;z)^R*)]# Up t = {P rIP t) • P p t

The uncertainty in the estimate of P r for a given error in cris:

fipR = PiG^X^mTtfR*)]» Ha= WCT).

The uncertainty in the estimate of P r for a given error in G is:

U pr = 2PToG??m nflt)]»M g = ( 2 P r/G) • Mg

The uncertainty in the estimate of P r for a given error in A is:

Mpr = 2ProG^A^[(4;r)V)]. mx= (2P^A ). Mx

The uncertainty in the estimate of P r for a given error in R is:

Mpr = -4PtctG^A^/[(4^)^/?^)]» Mr = (.-4Pr/R) • Mr

Assuming that the measurements are independent and the uncertainties are normally distributed, we can combine the uncertainties for Step 5. However, without actual measurements we are unable to proceed to Steps 6 and 7, and thus our analyses throughout Chapters 5-8 will proceed only to Step 5, whenever typical values can be found for both the parameter values and the uncertainties for every parameter in the model. Further, the goal will be to calculate the fractional uncertainty [1] in this case, in order to associate a relative degree of uncertainty with each parameter. Squaring each term in the equation, then adding the terms together and dividing both sides by P_r, we obtain:

(μ_c,Pr/P_r) = [(μ_Pt/P_t)² + (μ_σ/σ)² + (2μ_G/G)² + (2μ_λ/λ)² + (4μ_R/R)²]^(1/2)

A typical value of uncertainty for each quantity will be used, chosen in the mid-range of normal results, according to the respective x_i's: μ_Pt = 0.1 dBW [3]; μ_σ = 6.70 dBm² [4]; 2μ_G = 6 dB [5]; 2μ_λ ≈ 0 m [6]; and |4μ_R| = 0.8 m (see the Note at the end of the chapter). The units cancel since only ratios are used in the fractional uncertainty. The quantities in the denominators (the values of the x_i's) were given above. Substituting, we obtain:

μ_c,Pr/P_r = [(0.1/30)² + (-6.70/-13.40)² + (6/30)² + (0/0.05)² + (0.8/10000)²]^(1/2)

= [0.00001 + 0.25 + 0.04 + 0 + 0]^(1/2)

= [0.29]^(1/2) = 0.54

Thus, the combined fractional uncertainty μ_c,Pr/P_r is 54%; that is to say, the value of the returned power P_r has a theoretical combined uncertainty of 54%. It can be noted that the values of three of the terms are so small that they can be neglected, while the only other significant term is related to the gain G. The relative contributions of uncertainty among the five parameters (the x_i's) are graphed in Figure 5.2.

Figure 5.2. Relative theoretical uncertainties μ(y, x_i) for the radar point equation.
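The combined calculation above can be reproduced with a short script. This is a minimal added sketch, not part of the original analysis; the grouping of each parameter into a (value, uncertainty, sensitivity multiplier) triple is an illustrative convenience:

    import math

    # (value x_i, uncertainty mu_i, sensitivity multiplier k_i) per parameter,
    # using the "typical" quantities and mid-range uncertainties quoted above.
    # Each parameter contributes the fractional term (k_i * mu_i / x_i)**2.
    params = {
        "P_t":    (30.0,    0.1,  1),   # dBW
        "sigma":  (-13.40,  6.70, 1),   # dBm^2 (uncertainty = half the value)
        "G":      (30.0,    3.0,  2),   # dB (2*mu_G = 6 dB)
        "lambda": (0.05,    0.0,  2),   # m
        "R":      (10000.0, 0.2,  4),   # m (|4*mu_R| = 0.8 m)
    }

    combined = math.sqrt(sum((k * mu / x) ** 2 for x, mu, k in params.values()))
    print(f"combined fractional uncertainty of P_r: {combined:.2f}")  # ~0.54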

The uncertainty in G is due to the noise that occurs when the signal strength is amplified, known to average about 3 dB (roughly 10% of G) [5]. The returned signal must be amplified because it is extremely weak. For example, in SEASAT the average radiated power was 50 watts, but the effective return power from a radar cross section of 10 m² was only on the order of 10⁻¹⁸ watts [6]! The receiver amplifies the signal to a useful level, but in the process both the signal and the background noise are amplified.

The uncertainty in σ was conservatively estimated as one-half of the value of the radar cross section, since on average there is a 50% probability that a single-look pixel value lies outside the 3 dB range (that is, outside the half-power range of the returned signal) [4]. This is due to the effect of speckle noise, discussed in the following section. The uncertainty value of 50% allowed for σ is conservative because it does not include several significant additional contributors to uncertainty, discussed throughout this chapter.

A central principle of the methodology used in this study is that it is most important to quantify and mathematically describe those factors that contribute the greatest uncertainty. It is interesting to note that many volumes of books and articles have been dedicated to quantifying the uncertainty associated with scores of general factors in the SAR/InSAR processes, except for one: the radar cross section σ. Even Dr. Fawwaz Ulaby makes only the following terse remark about the radar scattering cross-section:

“The factors associated with the scatterer are...difficult to measure individually, and their relative contributions are uninteresting to one wishing to know the size of the received radar signal. Hence they are normally combined into one factor, the radar scattering cross-section.” [7]

Later in the same book, Ulaby writes:

“Let us...introduce the notion that the specific radar cross-section σ° for a terrain element is appropriately considered as a random variable.” [8]

However, no further explanation is provided, seemingly anywhere! Although an integrated model for the scattering contributions has (apparently) not been developed, sub-models have been developed to some degree for the individual factors, such as sea ice, pine forest, soil moisture, etc. The matter of how to treat backscatter will be considered further in the discussion at the end of the chapter.

In the following sections, theoretical uncertainty models will be developed according to the methodology, as sources allow.

5.2 Analysis of Uncertainties in the Radar Cross Section

We begin this section by noting that much of the literature dealing with reflectance properties in an image uses the symbol σ° to relate the inferred target area σ to a geometrical area dA on the terrain, by the relation:

σ° = σ/dA

The symbol σ° is often referred to as the backscattering coefficient, or simply the backscatter.

The dominant factor of uncertainty in backscatter is the phenomenon of speckle. Other contributing factors (for a given radar wavelength and polarization), in descending order of influence, are surface roughness, local incidence angle, and the dielectric properties of the material [9]. Each of these factors, and other minor ones, are discussed in some detail in Sections 5.2.1 through 5.2.4.

5.2.1 Analysis of Uncertainty in Speckle

For each resolution cell (such as a pixel) we obtain the value of the mean backscatter σ°. However, the size of the resolution cell in SAR is large compared to the wavelength of the signal. Because of this, the resolution cell usually includes many scattering centers (i.e., individual reflections), each on the order of a wavelength, which can be depicted as phasors (Q, I) comprised of length (magnitude) and angle (phase). The individual phasors combine in a complex linear superposition to produce a resultant phasor, as shown in Figure 5.3. Since SAR is a coherent (monochromatic) signal, constructive and destructive interference from the terrain affects the response. The resultant phasor is returned as the measured backscatter. The result is the mottled, salt-and-pepper appearance in the image known as speckle, shown in Figure 5.4. This is in distinct contrast to optical remote sensing systems, which use natural, polychromatic radiation; such radiation has a random phase structure, and the underlying random phases therefore add in power [10].

We have seen that a single observation of reflectivity at one pixel results in an unreliable estimate of the mean backscatter. A technique that is commonly used to reduce the effect of speckle is called multilooking. Multilooking is a type of averaging that improves the reliability of the estimate and reduces its variance. The technique reduces the resolution of the image while smoothing out the variation, by averaging over different sub-images of the same scene. A typical number of looks, N, is 8; as N increases, the area of each pixel increases and the variance decreases. When N independent samples are averaged, the estimates are assumed to cluster more closely to the actual value, and the variance decreases. The resulting deviation is the standard deviation of one input sample, divided by √N [10]. Figure 5.5 shows the effect of multilooking in a different scene (zoomed out considerably as compared to Figure 5.4), which gives a more pleasing and interpretable appearance.
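The 1/√N behavior of multilooking noted above can be illustrated with a small simulation. This is an added sketch; modeling single-look intensity as exponentially distributed about the true mean is a standard assumption for fully developed speckle, not a detail given in the text:

    import random
    import statistics

    random.seed(0)
    TRUE_BACKSCATTER = 1.0  # mean intensity of the resolution cell

    def look():
        # Single-look SAR intensity under fully developed speckle is
        # exponentially distributed about the true mean backscatter.
        return random.expovariate(1.0 / TRUE_BACKSCATTER)

    for n_looks in (1, 4, 8, 16):
        # Average N independent looks per pixel, then measure the spread
        # of the averaged estimates over many simulated pixels.
        pixels = [statistics.mean(look() for _ in range(n_looks))
                  for _ in range(20000)]
        print(f"N = {n_looks:2d}  std dev = {statistics.stdev(pixels):.3f}  "
              f"(expected ~ {1.0 / n_looks ** 0.5:.3f})")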

Figure 5.3. Complex linear superposition of individual scattering centers produces speckle; the resultant phasor is the magnitude and direction of the vector sum of the individual phase contributions (after [10]).

Figure 5.4. Single-look SAR image, showing evidence of speckle. [11]

Figure 5.5. A SAR image after multilooking (averaging) has been applied. [12]

The uncertainty of speckle is modeled as a type of probability function, such as a Gaussian distribution. The Gaussian probability density is described as:

p(x) = [1/((2π)^(1/2)·σ_x)]·exp[-(x - ⟨x⟩)²/(2σ_x²)]

where ⟨x⟩ is the mean of x, and σ_x² = ⟨(x - ⟨x⟩)²⟩ is the variance of x [10].

A further discussion about the validity of using a Gaussian distribution is given at the end of the chapter.

Our next step would normally be to apply the methodology. However, not enough information is available to describe the measurand. Is the measurand the probability? Where is a mathematical model to link the expression to the backscatter in terms of Y = f(x_i)? Without this information, we can proceed no further in an attempt to quantify the uncertainty of speckle, other than our earlier statement that, for a single-look image, the uncertainty in the backscatter is 50%.

5.2.2 Analysis of Uncertainty in Surface Roughness

In general, the brightness of the radar backscatter increases with increasing surface roughness, which comprises two components: the surface and its roughness. The surface of importance is the boundary between two media that have significantly different dielectric properties. A fully dielectric material does not conduct energy and is therefore an insulator, while some materials are only partially dielectric. A dielectric surface can be very different from an optical surface. The dielectric constant of a vacuum and of air is 1; of germanium, 16; and of water, 80.4 [13]. Interestingly, the dielectric constant of dry snow is relatively low, so the reflectance boundary may be an underlying ice layer rather than the surface of the snow. In some cases, ancient riverbeds have been identified many meters below the surface of dry, sandy desert areas. The depth of penetration is also related to the wavelength, whereby longer wavelengths penetrate to a greater depth. For example, a short wavelength of 1 cm may scarcely penetrate the surface at all, whereas a long wavelength of 1 m may penetrate 0.3 m into wet soil or up to 1.0 m into dry soil [14]. In most cases water, buildings, rocks, etc. present a pronounced dielectric boundary between the air and the surface, and therefore very little penetration occurs in these types of features.

If a dielectric surface has average height variations δh, then the roughness criterion depends on three parameters: the incidence angle θ_i, the wavelength λ, and the average height variations δh. The angle θ_i is the incidence angle of the sensor, relative to the normal of the horizontal surface. From the following relation, the relative phase difference Δφ can be found, and from it the difference in height:

Δφ = (4π·δh/λ)·cosθ_i

This expression provides the criterion for roughness. The relation between the parameters is depicted in Figure 5.6. If the phase difference is small, the rays from the higher parts of the surface tend to add in phase in the specular direction; that is, the signal tends toward specular reflection, reflecting away from the source at an equal but opposite angle. This effect is attributed to the principle of reflection found in Snell’s Law in optics for a smooth surface [9]. The effect on the backscatter is that very little of the signal returns to the receiver, causing a weak backscatter, shown as darker (non-shadowed) areas on an image.

Figure 5.6. Model of Surface Roughness, showing the wave phase front incident on the reflecting surface (after [9]).

Referring to the surface roughness model of Figure 5.6, as the phase difference increases from zero, the rays become more out of phase until the difference approaches π. Here, the two waves have opposite phase, so the net energy in the specular direction is zero. Following the principle of conservation of energy, the signal must be scattered in other directions, resulting in the diffuse reflection that characterizes a rough surface [9].

In SAR measurements, the definition of a “rough surface” varies according to wavelength. A ground feature that may appear rough at a 3 cm wavelength (such as grass or fine gravel) may appear smooth at a 30 cm wavelength. Not only that, but a rough surface can appear smooth at a larger angle of incidence θ_i. A relative phase difference of π/2 is generally recognized as the dividing line between a rough and a smooth surface. This corresponds to a one-way difference in path length of λ/8. A surface is therefore typically defined as smooth if it satisfies the Rayleigh criterion, expressed as [9]:

δh < λ/(8·cosθ_i)

Thus, a surface appears smooth if either δh/λ approaches zero or θ_i approaches 90°. Generally, as roughness increases the backscatter increases, resulting in a brighter return; a short numerical sketch of the criterion is given below.
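The following added sketch classifies a surface under the Rayleigh criterion; the 2 cm height variation and 45° incidence angle are hypothetical values chosen to echo the grass example above:

    import math

    def is_smooth(dh_m, wavelength_m, incidence_deg):
        """Rayleigh criterion: smooth if dh < lambda / (8 * cos(theta_i))."""
        theta = math.radians(incidence_deg)
        return dh_m < wavelength_m / (8.0 * math.cos(theta))

    # A grass-like height variation of 2 cm at a 45-degree incidence angle
    # is rough at a 3 cm wavelength but smooth at a 30 cm wavelength.
    for wavelength in (0.03, 0.30):
        print(f"lambda = {wavelength:.2f} m -> smooth: "
              f"{is_smooth(0.02, wavelength, 45.0)}")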

When the size of the scattering objects approaches the size of the resolution cell, the brightness is generally best treated as a texture for image classification.

For multi-frequency (i.e., multi-wavelength) radars, the wavelength sensitivity of the terrain may be used to classify the relative scale of surface roughness at a given incidence angle. Alternatively, for a single-wavelength system, certain roughness scales can often be determined from a variety of incidence angles. The distinctions between dielectric surface and roughness become blurred in areas such as forests, where the signal penetrates the canopy. The extent of penetration depends on the wavelength, and the resulting effect is known as volume scattering. In volume scattering the signal may bounce several times, from tree trunk, to branch, to leaves, and to ground, before returning to the sensor.

Although the mathematical model for surface roughness explains how the phase

difference is affected by the terrain, the wavelength, and the angle of incidence, it does not directly relate these quantities to the observed backscatter. However, in the interest of analyzing the associated uncertainties, we again apply the uncertainty methodology to the equation of surface roughness:

Δφ = (4π·δh/λ)·cosθ_i

The measurand Y is here defined as Δφ, the difference in phase between two successive signals A and B, and is described mathematically by the above equation. Since the phase combines with the signal amplitude in producing the image, a change in phase can make a large difference in the level of returned backscatter. Values for the three parameters and their uncertainties are: μ_δh/δh = 0.23 cm/0.95 cm, μ_λ/λ ≈ 0, and μ_θ/θ_i = 5°/45°, based on the mid-range of typical parameters provided by [16, 17].

Taking the partial derivatives and substituting Δφ, we obtain the following.

The uncertainty in the estimate of Δφ for a given error in δh is:

μ_Δφ = (4π/λ)·cosθ_i · μ_δh = (Δφ/δh) · μ_δh

The uncertainty in the estimate of Δφ for a given error in λ is:

μ_Δφ = -(4π·δh/λ²)·cosθ_i · μ_λ = -(Δφ/λ) · μ_λ

The uncertainty in the estimate of Δφ for a given error in θ_i is:

μ_Δφ = -(4π·δh/λ)·sinθ_i · μ_θi = -(Δφ·tanθ_i) · μ_θi

Assuming that the parameters are independent, the combined fractional uncertainty is then given by:

μ_c,Δφ/Δφ = [(μ_δh/δh)² + (μ_λ/λ)² + (μ_θi/θ_i)²]^(1/2)

Substituting the set of typical values given above yields:

μ_c,Δφ/Δφ = [(0.23 cm/0.95 cm)² + 0 + (5°/45°)²]^(1/2) = [0.0586 + 0 + 0.0123]^(1/2) = 0.266

Therefore, a combined fractional uncertainty of 26.6% is found in the measurement model. Since the uncertainty of the wavelength is known to be ≈0 [6], the relative uncertainty of the remaining two factors is shown in Figure 5.7.

Figure 5.7. Relative theoretical uncertainties μ(y, x_i) for the surface roughness equation.
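The 26.6% figure can be re-derived with a few lines (an added sketch that simply re-evaluates the combined expression with the values quoted above):

    import math

    # Fractional uncertainty terms for the surface roughness model.
    terms = {
        "dh":     0.23 / 0.95,  # mu_dh / dh
        "lambda": 0.0,          # mu_lambda / lambda ~ 0
        "theta":  5.0 / 45.0,   # mu_theta / theta_i
    }

    combined = math.sqrt(sum(t ** 2 for t in terms.values()))
    print(f"combined fractional uncertainty of delta-phi: {combined:.3f}")  # ~0.266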

The uncertainty in θ_i is presumed to be largely due to the wide angular spread of the signal across the scene, in this case for an airborne SAR. The uncertainty could therefore be expected to be smaller for spaceborne SAR, since at a longer range the angular spread is reduced. Its effect may also be lessened during subsequent geometric and radiometric processing. However, the uncertainty in δh is largely a function of the ever-varying natural terrain, in which the real uncertainty could be far greater than that used in our example. The example was based on measurements of height differences across a homogeneous terrain of old lava flow with easily-measured properties; in more variable terrain this uncertainty would no doubt be far higher. If the model requires that a user enter δh and μ_δh, then the model may not be very useful in production situations, due to a large and unreliable degree of uncertainty.

5.2.3 Analysis of Uncertainty of Local Incidence Angle

The geometry and radiometry of a SAR image are influenced by the angle of incident illumination with respect to the local slope of the scene towards the radar. In the prior discussion of the incidence angle in Section 5.2.2, it was assumed that the underlying terrain was flat. The local incidence angle α_loc at any point is the angle between the radar line-of-sight to the point in question and the normal to the tangent plane at that point, as shown in Figure 5.8.

Figure 5.8. Relationship of θ_i, α_loc, and the terrain to the incident signal (after [9]).

While a variation of slope in the azimuth direction (i.e., the forward direction of flight) has only a negligible effect on image brightness, the slope in the elevation plane (i.e., the side-looking direction) has a significant effect. According to [9], “...the local incident angle looms as the largest source of error in the radiometric calibration budget for SAR systems.” Further, van Zyl et al. [96] report that not taking topographic variations into account can cause radiometric errors ranging from 1 dB to more than 10 dB, with the larger errors in areas of high relief, and that polarimetric calibration errors can be introduced if the topography is not considered.

Once again, as with the backscatter, we are faced with the problem of an unknown, naturally-varying terrain. Although SAR-related instrumentation tends to be exhaustively accounted for by way of error models, van Zyl states, “It is therefore important to investigate possible factors, other than calibration techniques and devices, that may limit the achievable calibration accuracy for SAR data.” [96] In his study, he applied a DEM to reduce the amount of radiometric error introduced by the topography; however, the effects of the micro-topography were not assessed. Since SAR interacts with the terrain at the micro level, one would think that ignoring this effect could introduce considerable uncertainty into the radiometry.

In order to isolate the effect of the surface roughness, the study of the influence of the local incidence angle will be limited to cases in which the backscatter σ° is relatively independent of the incidence angle. This situation occurs when the surface is rough enough to scatter the signal equally in all directions. Then the ratio of the pixel brightness received from a sloping surface to the pixel brightness received from a horizontal surface is:

P_s°/P_h° = sinθ_i / sin(θ_i - α_loc), where

P_s° = uncalibrated brightness estimate of reflectivity received from a sloping surface
P_h° = uncalibrated brightness estimate of reflectivity received from a horizontal surface
θ_i = incidence angle of the sensor relative to the normal of a horizontal surface
α_loc = the terrain slope relative to a horizontal surface

and where the relation between θ_i, α_loc, and the terrain was shown in Figure 5.8.

The equation tells us that a change in the slope of the terrain (including micro-terrain, at the scale of the sensor wavelength) can have a large impact on the relative brightness of an image when the incidence angle θ_i is small. Even when the reflectivity is constant, the ratio P_s°/P_h° increases as the local slope α_loc approaches the radar incidence angle θ_i. For example, a sensor with θ_i = 23° (such as ERS-1) is twice as sensitive to slope-induced gain as one with θ_i = 44° (such as Almaz) [9]; a small numerical sketch follows.
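The sensitivity of the ratio can be seen in the following added sketch (the 10° slope is an arbitrary illustrative value):

    import math

    def brightness_ratio(theta_i_deg, alpha_loc_deg):
        """Ps0/Ph0 = sin(theta_i) / sin(theta_i - alpha_loc)."""
        ti = math.radians(theta_i_deg)
        al = math.radians(alpha_loc_deg)
        return math.sin(ti) / math.sin(ti - al)

    # A 10-degree slope under ERS-1-like vs. Almaz-like incidence angles:
    # the smaller incidence angle yields the larger slope-induced gain.
    for theta_i in (23.0, 44.0):
        print(f"theta_i = {theta_i:.0f} deg -> Ps0/Ph0 = "
              f"{brightness_ratio(theta_i, 10.0):.2f}")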

The surface roughness conditions are linked with the backscatter σ° for the local incidence angle through the general relationships depicted in Figure 5.9, according to rough, less rough, and smooth surfaces. This graph assumes a horizontal reference plane. It is important to note the very wide statistical variation in the dB level of σ° relative to the roughness curves at all values of local incidence angle, as well as the practical lower limit set by noise. The author suggests that, to a first-order approximation, the local incidence angle dominates the interaction with the angle of incidence; however, no mathematical model is offered to relate the two angles [9].

Figure 5.9. Effect of surface roughness and local incidence angle α_loc on the scattering coefficient σ°. [Plot residue omitted; the figure plots the scattering coefficient (dB) against the local incidence angle (0° to 90°) for rough, less rough, and smooth surfaces, together with the statistical variation and the noise-equivalent σ° after antenna gain compensation.] Note that these combined influences can cause σ° to vary by as much as 40 dB, which is a factor of 10,000 on a regular numerical scale in the returned power P_r.

Next, the new methodology is used to determine the theoretical uncertainty of the ratio P_s°/P_h°. The measurand (Step 1) was described earlier: a ratio of relative brightness that compares the brightness return at a given pixel (of unknown slope) against the brightness return of horizontal terrain. It should be noted that this is an uncalibrated measurement, and it raises the question: how is the determination of P_h° made? To have strong confidence in P_h° as a source of comparison, we would need to know that the surface used for P_h° was indeed horizontal; that the material roughness for P_h° was comparable to that of P_s°; and that the dielectric constant was comparable to that of P_s°. To have confidence in the results, we would also need to know the degree of uncertainty associated with each of these measures; but since P_s°/P_h° is only a relative measure, this is unknown. It can therefore be concluded that the measurement ratio P_s°/P_h° (or the use of either of these quantities alone) is unreliable. In any event, since the equation we are given is in terms of this ratio, we will proceed with the uncertainty analysis.

For the second step, the equation for the measurand Y in terms of the input quantities X_i, as given earlier in this section, is:

P_s°/P_h° = sinθ_i / sin(θ_i - α_loc)

The third step is to determine the estimated values x_i for the input quantities X_i. The incidence angle θ_i can vary from 30° to 85° within a single image in airborne radar, but it varies much less in satellite radar (about 2°) due to the far higher elevation of the satellite platform. To be consistent with the assessment of uncertainty used for surface roughness, we will use the same values as before (in Section 5.2.2) for the incidence angle uncertainty and measurement: μ_θ/θ_i = 5°/45°. However, the value of the local incidence angle α_loc and its uncertainty are unknown, except in very tightly controlled situations, due to the changes in slope of the microtopography throughout the area of an image (including such variables as 90° building walls). Therefore, for the uncertainty μ_αloc we use the typical limits of imaging, which are 0° < α_loc < 90°. We thus choose an arbitrary value of α_loc = 10°, within this normal range, to represent a “typical” pixel slope.

For the fourth step the component uncertainties are determined. Taking the partial derivatives, we first obtain the uncertainty in the estimate of P_s°/P_h° for a given error in θ_i:

μ_Ps/Ph = {[cosθ_i·sin(θ_i - α_loc) - sinθ_i·cos(θ_i - α_loc)] / sin²(θ_i - α_loc)} · μ_θi

Multiplying the right-hand side by sinθ_i/sinθ_i and substituting P_s°/P_h° for the corresponding quantities, we obtain:

μ_Ps/Ph = [cotθ_i - cot(θ_i - α_loc)] · (P_s°/P_h°) · μ_θi

Next, we determine the uncertainty in the estimate of P_s°/P_h° for a given error in α_loc, then substitute P_s°/P_h° for the corresponding quantities on the right-hand side to obtain:

μ_Ps/Ph = [sinθ_i·cos(θ_i - α_loc) / sin²(θ_i - α_loc)] · μ_αloc = cot(θ_i - α_loc) · (P_s°/P_h°) · μ_αloc

Then we divide both sides by P_s°/P_h°, square the terms on the right-hand side, and take the square root to obtain the combined fractional uncertainty:

μ_c/(P_s°/P_h°) = {[cotθ_i - cot(θ_i - α_loc)]²·μ_θi² + cot²(θ_i - α_loc)·μ_αloc²}^(1/2)

Substituting the values given earlier in the section for θ_i, α_loc, μ_θi, and μ_αloc, and converting μ_θi and μ_αloc to radians, gives:

μ_c/(P_s°/P_h°) = [(2.229)(0.00757) + (4.454)(2.465)]^(1/2) = 3.316

Figure 5.10. Relative theoretical uncertainties μ(y, x_i) for the equation involving the local incidence angle α_loc.

This means that the fractional uncertainty computed from our values is over 300%! No doubt this can be traced to the uncertainty for the local incidence angle, which is considerably larger than the value of the angle itself. The proportion of fractional uncertainty between the two parameters is shown in Figure 5.10. Once again, we have no way to relate the local incidence angle to the backscatter; a numerical sketch of the combined expression follows.
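A minimal numerical sketch of the combined expression is given below. Because some intermediate factors in the source could not be fully deciphered, evaluating the expression as reconstructed here gives roughly 2.2 rather than the 3.316 printed above; the conclusion, an uncertainty well above 100% dominated by μ_αloc, is unchanged:

    import math

    theta_i   = math.radians(45.0)   # incidence angle
    alpha_loc = math.radians(10.0)   # "typical" pixel slope
    mu_theta  = math.radians(5.0)    # uncertainty in theta_i
    mu_alpha  = math.radians(90.0)   # uncertainty spanning the 0-90 deg range

    def cot(a):
        return 1.0 / math.tan(a)

    term_theta = (cot(theta_i) - cot(theta_i - alpha_loc)) ** 2 * mu_theta ** 2
    term_alpha = cot(theta_i - alpha_loc) ** 2 * mu_alpha ** 2
    print(f"combined fractional uncertainty: "
          f"{math.sqrt(term_theta + term_alpha):.2f}")  # ~2.24, i.e. well over 100%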

5.2.4 Uncertainty in the Measurement of Dielectric Properties

The dielectric constant is the smallest contributor of the three main factors that affect the backscatter. A dielectric is a material that is neither a perfect conductor nor perfectly transparent to electromagnetic radiation, such as ice, vegetation, or rock. The electrical properties of materials between these two extremes are described by two quantities: the relative dielectric constant and the loss tangent. The dielectric constant of a material strongly influences the interaction of the electromagnetic radiation with the terrain surface. The dielectric constant is also influenced by the frequency (and therefore by the wavelength, since the frequency f = c/λ, the speed of light divided by the wavelength). The main contributor to the dielectric constant of a natural surface is the amount of water present in the material, which, as we saw earlier, has a very high value of ~80 compared to ~1 for vacuum or air. Most dry natural materials have a dielectric constant between 3 and 8 in the microwave region [18].

For our purposes, perhaps the expression of most interest related to the dielectric constant is the loss tangent, which we take here as the measurand of interest, given by:

tanδ = ε″/ε′ = c/(ω₀·ε), where

ε″ = the lossy part of the dielectric constant
ε′ = the dielectric constant of the material
c = the conductivity of the material
ω₀ = 2π·f (the angular frequency)
ε = the permittivity of the material, which describes its complex electrical properties according to ε = ε′ - jε″.

In SAR, the loss tangent is used most often to express the contribution of the dielectric constant. The loss tangent is strongly dependent on frequency, while the dielectric constant by itself only weakly depends on frequency. Table 5.2 shows this dependence.

Material       ε′    Frequency at which c = ω₀ε (Hz)
Sea water      81    8.9 × 10⁸
Fresh water    81    2.2 × 10⁵
Wet soil       10    1.8 × 10⁷
Dry soil        5    3.6 × 10⁵

Table 5.2. Dependence of the dielectric constant on frequency. [18]

Taking the partial derivatives, we obtain the uncertainty of tanδ with respect to ω₀:

μ_tanδ = -[c/(ω₀²·ε)] · μ_ω₀

and the uncertainty of tanδ with respect to ε:

μ_tanδ = -[c/(ω₀·ε²)] · μ_ε

Combining the total uncertainties, we obtain:

μ_tanδ = {[c/(ω₀²·ε)]²·μ_ω₀² + [c/(ω₀·ε²)]²·μ_ε²}^(1/2)
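A short propagation sketch follows; the conductivity, frequency, permittivity, and the 5% uncertainties are hypothetical illustration values, not drawn from the cited sources:

    import math

    def tan_delta(c, omega0, eps):
        """Loss tangent: tan(delta) = c / (omega0 * eps)."""
        return c / (omega0 * eps)

    def mu_tan_delta(c, omega0, eps, mu_omega0, mu_eps):
        """Combined uncertainty from the two partial derivatives above."""
        t_omega = (c / (omega0 ** 2 * eps)) ** 2 * mu_omega0 ** 2
        t_eps = (c / (omega0 * eps ** 2)) ** 2 * mu_eps ** 2
        return math.sqrt(t_omega + t_eps)

    # Wet-soil-like illustrative numbers at 1.4 GHz, with eps = eps' * eps_0:
    c, f, eps = 1e-2, 1.4e9, 10 * 8.854e-12
    omega0 = 2 * math.pi * f
    print(tan_delta(c, omega0, eps))                                # ~0.013
    print(mu_tan_delta(c, omega0, eps, 0.05 * omega0, 0.05 * eps))  # ~0.0009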

As an example of the relationship between the moisture level of a material and its dielectric constant, Figure 5.11 shows the relationship between the moisture content of five different soils and the dielectric constant at a frequency of 1.4 GHz (i.e., a wavelength of about 20 cm). It can be noted that ε′ corresponds closely to the water content: a volumetric moisture content of 50% yields a dielectric constant of between 25 and 35. Extrapolating upwards from the curve, we would expect that as the volumetric soil moisture content continues to increase, so will the dielectric constant, until it approaches an ε′ of 80 (the value for 100% water). Due to the complexity of the mathematical treatment of this subject, the difficulty of locating uncertainty data about dielectric constants, and the fact that the dielectric constant is the smallest of the three contributors to the backscatter, the uncertainty of measurement will not be considered further here. However, we can state that the dielectric constant varies over a range of roughly 1 to 80. As with the other parameters relating to the backscatter, a mathematical model linking the backscatter components to each other could not be found.

Figure 5.11. The relationship between the measured dielectric constant and volumetric soil moisture for five soil types at a frequency of 1.4 GHz [19]. [Plot residue omitted; the curves show the dielectric constant rising with volumetric moisture m_v.]

5.3 Analysis of Uncertainty of Other Factors

This section discusses additional factors that influence the radar signal and analyzes their uncertainties to the extent possible in a brief treatment. (These factors are not included in the basic radar equation.) Polarimetry, interferometric SAR, coherency, volume scattering, and Bragg scattering are among the factors discussed.

5.3.1 Analysis of Uncertainty in Polarimetry

The basic concept of polarimetry was discussed in Chapter 1. A coherent electromagnetic signal has not only a magnitude and phase, but also a polarization, in which the signal can propagate in a regular linear or elliptical fashion (see Figures 1.1 and 1.2). A polarimetric radar system measures the degree of polarization of the scattered (reflected) wave through a vector measurement process. This process allows every pixel in an image to have a full polarization signal recorded. Numerous studies report an ability to infer detailed information about the geometric structure of specific types of surfaces and of vegetation using polarimetric analysis.

Whereas early radars had only a single polarization, additional polarizations are becoming more common. The maximum polarization information may be obtained through quadrature polarimetry (quad-pol), in which one antenna transmits a vertical signal and the other a horizontal signal. Each antenna receives back the signal, whose particular orientation can be represented by a vector combination of a horizontal and a vertical signal. The different types of polarizations are therefore:

HH horizontal send - horizontal receive (co-polarization)
VV vertical send - vertical receive (co-polarization)
HV horizontal send - vertical receive (cross-polarization)
VH vertical send - horizontal receive (cross-polarization)

Upon contact with the terrain surface, the polarization can be changed at the boundary by factors such as surface roughness, object geometry, angle of incidence, wavelength [20], and moisture content of the surface [21]. For practical purposes, the transmitted wave is completely polarized, while the reflected wave is less so. This is because the returned signal is comprised of a superposition of a large number of waves of a variety of polarizations, due to backscattering by a statistically random surface [22]. The magnitude of the radar cross-section depends on the polarization of both the transmitting and receiving antennas. The measurements of the polarization signal are represented in the form of a complex (amplitude and phase) scattering matrix for each resolution element of the radar image. Perhaps the most common form of the scattering matrix is given as a function of the returned power, where:

P_rec = ½·[S_r]ᵀ·[M]·[S_t]

for which P_rec is a 1 × 1 scalar; [S_r]ᵀ is a 4 × 1 Stokes vector of the reflected signal, transposed to a 1 × 4 matrix; [M] is a 4 × 4 matrix, which is calculated; and [S_t] is a 4 × 1 Stokes vector of the transmitted signal, where the Stokes vector is given by the following expression:

[S] = [ a_H² + a_V²,  a_H² - a_V²,  2·a_H·a_V·cosδ,  2·a_H·a_V·sinδ ]ᵀ

In this expression, the parameters a_H and a_V correspond to the horizontal and vertical amplitudes of the two polarized wave components, and δ is the phase difference between them [23].

Taking the partial derivatives, we obtain:

The uncertainty of [S] with respect to a_H:

μ_S,aH = [ 2a_H,  2a_H,  2a_V·cosδ,  2a_V·sinδ ]ᵀ · μ_aH

The uncertainty of [S] with respect to a_V:

μ_S,aV = [ 2a_V,  -2a_V,  2a_H·cosδ,  2a_H·sinδ ]ᵀ · μ_aV

The uncertainty of [S] with respect to δ:

μ_S,δ = [ 0,  0,  -2a_H·a_V·sinδ,  2a_H·a_V·cosδ ]ᵀ · μ_δ

And finally, the combined uncertainty of [S], element by element:

μ_S = [ ((2a_H)²·μ_aH² + (2a_V)²·μ_aV²)^(1/2),
        ((2a_H)²·μ_aH² + (2a_V)²·μ_aV²)^(1/2),
        ((2a_V·cosδ)²·μ_aH² + (2a_H·cosδ)²·μ_aV² + (2a_H·a_V·sinδ)²·μ_δ²)^(1/2),
        ((2a_V·sinδ)²·μ_aH² + (2a_H·sinδ)²·μ_aV² + (2a_H·a_V·cosδ)²·μ_δ²)^(1/2) ]ᵀ
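The element-wise combination above can be evaluated directly, as in the following added sketch (the amplitudes, phase difference, and uncertainties are hypothetical values):

    import math

    def stokes(a_h, a_v, delta):
        """Stokes vector of a fully polarized wave, per the expression above."""
        return [a_h ** 2 + a_v ** 2,
                a_h ** 2 - a_v ** 2,
                2 * a_h * a_v * math.cos(delta),
                2 * a_h * a_v * math.sin(delta)]

    def stokes_uncertainty(a_h, a_v, delta, mu_h, mu_v, mu_d):
        """Element-wise root-sum-square of the three derivative vectors."""
        d_ah = [2 * a_h, 2 * a_h, 2 * a_v * math.cos(delta),
                2 * a_v * math.sin(delta)]
        d_av = [2 * a_v, -2 * a_v, 2 * a_h * math.cos(delta),
                2 * a_h * math.sin(delta)]
        d_dl = [0.0, 0.0, -2 * a_h * a_v * math.sin(delta),
                2 * a_h * a_v * math.cos(delta)]
        return [math.sqrt((dh * mu_h) ** 2 + (dv * mu_v) ** 2 + (dd * mu_d) ** 2)
                for dh, dv, dd in zip(d_ah, d_av, d_dl)]

    print(stokes(1.0, 0.8, math.radians(30)))
    print(stokes_uncertainty(1.0, 0.8, math.radians(30), 0.05, 0.04, 0.05))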

The polarization “signature” of a scatterer is a plot of the power of the return wave as a function of the transmit and receive polarizations. The term “signature” is used loosely, since the polarization response is not unique: different combinations of scattering mechanisms can produce the same polarization response [22]. Figure 5.12 shows the polarization “signature” response for a grass surface measured at L-band (30 cm) at an incidence angle of 30°. The figure shows the relative intensity of the like-polarized and cross-polarized components as a function of the angular parameters of the polarization ellipse.

Figure 5.12. The figure on the left is the cross-polarized response, while the figure on the right is the co-polarized response, for a grass surface at θ_i = 30°. [24]

A further source of error for satellite-based polarimetry is Faraday rotation, which occurs when a wave passes through an electromagnetically-active medium. Ionized particles at high altitudes of the Earth’s atmosphere can cause the polarization vector to undergo a progressive rotation, whereby it can lose its H-V information. The effect is stronger at longer wavelengths: it is not significant at C-band (~4 cm), but can be a problem at L-band (~40 cm).

Although many studies report the ability to determine landscape features from polarimetry data [25-31], the only information that could be found regarding the associated uncertainty of representative data was found in [32]. In this article, Joughin et al. estimate the precision of polarimetric-signature returns as a function of the number of SAR looks and other estimates. They discuss how polarimetric signatures often exhibit very substantial variances, which follow the general nature of the speckle phenomenon for coherent data. For this reason, the authors state that it is often necessary to average polarimetry data in order to obtain “estimates of useful precision.” A probability density function (PDF) is used to model the polarimetric signal response, based on an assumption of a Gaussian distribution. [A PDF is a non-negative, continuous-variable function that integrates to 1 over its domain; the probability that X lies in an interval (a, b) is the integral of the PDF over that interval.] Polarimetric signatures are often analyzed according to the copolar ratio r_VV/HH = |S_VV|²/|S_HH|², the crosspolar ratio r_HV/HH = |S_HV|²/|S_HH|², and the copolar phase ρ = ⟨S_HH·S_VV*⟩/[⟨|S_HH|²⟩·⟨|S_VV|²⟩]^(1/2). The details of the derivations are beyond our scope here; however, the following results are reported in terms of the obtained uncertainties for the copolar ratio:

Looks N    Theory: mean r, μ_r      Simulation: mean r, μ_r
 1         -, -                     9.802, 3867.0
 4         2.240, 1.187             2.244, 1.187
16         2.048, 0.452             2.048, 0.452

Thus, we see that the uncertainties for the copolar ratio are unacceptably high for a single-look image; very large (>50% of the measurement ratio) for a 4-look image; and moderate (~22% of the measurement ratio) for a 16-look image.

Similarly, for the copolar phase, the following values are reported:

Looks N    Theory: ⟨φ⟩, μ_φ         Simulation: ⟨φ_VV/HH⟩, μ_φ
 1         0, 52.54°                0.06°, 52.43°
 4         0, 19.37°                0.00°, 19.48°
16         0, 7.91°                 0.00°, 7.91°

These results for the copolar phase are difficult to interpret, but seem to indicate that the uncertainties in the phase measurements are a great deal larger than the values of the measurements themselves, even when the image is averaged to 16 looks (1/16 of the original resolution).

5.3.2 Analysis of Uncertainty of Scene-Sensor Response in Interferometric SAR

InSAR principles were briefly introduced in Chapter 1. Here our main interest is the uncertainty in elevation for spaceborne and airborne platforms. The main sources of uncertainty are discussed below. Values could not be located for the error associated with all of the equation parameters; therefore, this section will rely in part on graphs and charts to explain the height uncertainties (in addition to the mathematical models).

According to [33], the uncertainties in elevation can be quantified using the following equation, based on a single pixel measurement:

μ_h = [ (Δ²/12)·cos²θ + {(Δθ/(2·SNR)^(1/2))² + (0.6·W/R)²}·R²·sin²θ ]^(1/2)

where

μ_h = height measurement accuracy, perpendicular to the map plane
Δ = radar range resolution = c/(2B), where B is the signal bandwidth
θ = line-of-sight angle
R = range from the interferometer to the clutter patch
SNR = signal-to-noise ratio (a full discussion of this is found in [33])
W = effective cross-range extent of the clutter patch in the range resolution cell

Two major contributors to the height error are depicted in Figure 5.13, showing the optimum resolution relative to terrain slope, at which a minimum height error can be expected.

Figure 5.13. Optimum resolution for minimum height uncertainty, depending on the slope of the ground. [33] [Plot residue omitted; the two panels plot RMS height error (m) against range resolution (m) for several ground slopes, showing IFSAR height measurement accuracy in stripmap mode and in spotlight mode.]

Typically, studies reported in the literature have already corrected much of their height uncertainty using DEMs and other control methods. However, without such corrections, up to an order of magnitude greater uncertainty in height can be expected. If extrapolations can reasonably be made from the following list [33], a generalization is that the corrected height error can be roughly one-half of the original pixel resolution:

Sensor Type                     Range     Ground Resolution   Corrected Height Error
Spaceborne repeat-pass          300 km    25 m                10 m
Spaceborne ERS-1 repeat-pass    790 km    25 m                10 m
Airborne TOPSAR bistatic        ~10 km    4 m                 2 m

5.3.2.1 Additional Uncertainties in Airborne SAR/InSAR

Airborne InSAR typically relies on the simultaneous use of two antennas, mounted on the fuselage and separated by a baseline. In such cases, the elevation uncertainties tend to be significantly less than in the repeat-pass case, on both airborne and satellite platforms. This is due both to a greater certainty in the baseline, because the inter-antenna distance is fixed, and to the absence of decorrelation effects. The main sources of vertical (as well as horizontal) positional uncertainties for dual-antenna airborne platforms are shown in Table 5.3. Many of these will be discussed in Chapter 6, since they are influenced by SAR processor functions.

VERTICAL SOURCES OF UNCERTAINTY

Vertical Errors               Sources
Azimuth tilt                  Vertical velocity bias
Range tilt                    Attitude biases; baseline orientation
Vertical offset               Absolute phase ambiguity; nav. system position
Correlated height error       Mocomp = nav. + processor; multi-path
High-frequency random         Signal-to-noise ratio; impulse response; channel co-registration

HORIZONTAL SOURCES OF UNCERTAINTY

Positional Errors             Sources
Azimuth scale                 Velocity bias (nav. system)
Range scale                   Baseline length; absolute phase ambiguity; slant range calibration
Skew                          Velocity bias, processor
Rubber sheet distortion       Mocomp = nav. + processor
High-frequency across-track   Signal-to-noise ratio; impulse response; channel co-registration

Table 5.3. Sources of vertical and horizontal uncertainty in the dual-antenna airborne case. [34]

5.3.2.2 Additional Uncertainties for Repeat-Pass Satellite InSAR

Most literature on satellite-based InSAR relies on the use of two scenes of the same area, imaged during different orbits. The altitude is inferred from measurements of the returned phases. The uncertainty in the phase measurement (and hence the uncertainty in the elevation) is caused by several factors:

(1) The SAR system itself can generate uncertainties due to the clutter-to-noise ratio, the number of looks, pixel misregistration, and baseline decorrelation.

(2) Topography measurements increase in uncertainty as the baseline increases and as the slopes steepen. The phase becomes ambiguous, or is lost completely, when the terrain slope is so high that layover or shadowing occurs. [35]

The uncertainty in height as a function of phase error is given by [36]:

μ_z = [λ·ρ/(4π·P)] · [sinθ/cos(θ - α)] · μ_φ

where

λ = the wavelength
ρ = the slant range
P = the baseline
θ = the look angle
α = the interferometer baseline orientation with respect to the horizontal
μ_φ = the phase error in the interferogram, and
μ_z = the resultant height error.
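To make the sensitivity concrete, the following added sketch evaluates the expression; the ERS-like inputs (5.7 cm wavelength, 850 km slant range, 100 m baseline, 23° look angle, horizontal baseline, 0.5 rad phase error) are hypothetical illustration values, not taken from [36]:

    import math

    def height_error(wavelength, slant_range, baseline, look_deg,
                     baseline_tilt_deg, phase_error_rad):
        """mu_z = [lambda*rho/(4*pi*P)] * [sin(theta)/cos(theta-alpha)] * mu_phi."""
        theta = math.radians(look_deg)
        alpha = math.radians(baseline_tilt_deg)
        return (wavelength * slant_range / (4 * math.pi * baseline)
                * math.sin(theta) / math.cos(theta - alpha)
                * phase_error_rad)

    print(f"{height_error(0.057, 850e3, 100.0, 23.0, 0.0, 0.5):.1f} m")  # ~8 m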

Figure 5.14. Atmospheric humidity and pressure effects on the phase angle. [36] [Plot residue omitted; the panels plot phase effects against pressure (mb) and relative humidity (percent).]

Spaceborne observations are also subject to time and space variations in atmospheric water vapor and, to a lesser extent, air pressure and temperature. Atmospheric variations alone between orbits can produce up to 100 meters of elevation error in the worst conditions. The effects of atmospheric humidity and pressure on the phase are shown in Figure 5.14. This problem can be minimized by recording the relative humidity at the times of imaging and selecting two sets of phase data from images with similar (preferably low) humidity levels. This is one of the reasons why dual-antenna, single-pass satellite (and space shuttle) configurations are strongly preferred when accurate elevations are sought. The other reason is decorrelation, which is the limiting error source in repeat-pass InSAR [2], discussed in Section 5.3.3.

5.3.2.3 Further Considerations in InSAR

A thorough discussion of elevation uncertainties is documented by a number of authors and so will not be repeated here [see the references at the end of the chapter]. However, it can be noted that the choice of bandwidth, baseline, and antenna length, along with the terrain parameters of backscatter, surface roughness, and slope, can each influence the uncertainty in height by several meters. Further, in very steep terrain the slope effect alone can result in elevation errors in excess of 25 m [37].

Mountainous areas and areas with rapidly-changing heights tend to have a significantly higher uncertainty in elevation measurements than flatlands or gently-undulating terrain. Except for foreshortening and shadow effects, most systematic errors for both airborne and space platforms can be corrected by the use of ground control points and other methods. When ground control is not available, tilts, offsets, and scaling of the data can contribute significantly to the total error.

Further, it can be observed that the uncertainty of the ground truth is rarely mentioned. Thus, in some locations it is possible that the ground “truth” (such as a DEM) is in greater error than the InSAR-determined elevations. Such comparisons should be made with care, so that strong confidence in the uncertainty of a given InSAR dataset can be established.

InSAR datasets can provide remarkably detailed elevation maps, and are becoming useful for mapping remote and/or cloud-covered regions that have been difficult to map in the past, such as Antarctica, Malaysia, equatorial areas, etc. However, because of the factors mentioned above, a careful assessment should be made of the factors used in creating and checking the data. Otherwise, the results could contain significantly high errors, in both horizontal and vertical positioning.

5.3.3 Analysis of Uncertainty of Correlation Data in Repeat-Pass Interferometry

Correlation is the ability to match the phases from two scenes of the same area. Decorrelation occurs when the phases cannot be matched, an effect that often occurs in repeat-pass interferometry. There are three sources of decorrelation [38]:

- spatial baseline decorrelation from nonidentical viewing angles;
- rotation of the target between observations; and
- surface movement between observations.

Thus the total observed correlation can be represented as:

ρ_total = ρ_temporal · ρ_spatial · ρ_thermal

A lengthy formulation of the baseline correlation coefficient can be found in [39], as a function of the SAR system parameters, the baseline, the local (mean) surface slope, and the surface statistics.

Coherence, a related quantity, is a measure of the average value of the phase difference for small areas within an interferogram. (An interferogram shows relative elevations as a function of the phase differences, and looks similar to a detailed contour map.) The tolerance for phase agreement is about λ/8. Coherence can be a powerful tool for land surface classification and change detection. But when coherence falls below a certain level, incoherency (i.e., excessive signal corruption) results, making those areas unsuitable for determining elevation. A coherence image is generated from the following relation [40]:

C = ⟨X₁·X₂*⟩ / (⟨|X₁|²⟩ · ⟨|X₂|²⟩)^(1/2), where

X₁ and X₂ = the complex images
⟨ ⟩ = the spatial average operator
* = the complex conjugate operator, and
C = the coherence estimate
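A minimal sketch of this estimator is given below (added for illustration; the non-overlapping box average used as the spatial operator and the simulated complex images are assumptions, not details of [40]):

    import numpy as np

    def coherence(x1: np.ndarray, x2: np.ndarray, win: int = 5) -> np.ndarray:
        """C = <x1 * conj(x2)> / sqrt(<|x1|^2> * <|x2|^2>), windowed estimate."""
        def box_mean(a):
            # Crude spatial average over non-overlapping win x win blocks.
            h, w = (a.shape[0] // win) * win, (a.shape[1] // win) * win
            return a[:h, :w].reshape(h // win, win, w // win, win).mean(axis=(1, 3))
        num = box_mean(x1 * np.conj(x2))
        den = np.sqrt(box_mean(np.abs(x1) ** 2) * box_mean(np.abs(x2) ** 2))
        return np.abs(num) / den

    rng = np.random.default_rng(0)
    scene = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
    noise = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
    print(coherence(scene, scene).mean())          # ~1: identical images
    print(coherence(scene, scene + noise).mean())  # lower: partial decorrelation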

Farr et al. summarize the difficulties as follows: “...about 30-50% of the land surface is probably unsuitable for measuring topography with repeat pass interferometry.” [41] Further, H. Lee and J. Liu [42] state that the dominant decorrelation factor is the relief of the terrain, which for a side-looking system is “so overwhelming, particularly on a foreshortened or layover slope, that the coherence drops down dramatically toward zero.” The low coherence effect from the terrain is easily confused with temporal decorrelation. They report a lessening of the effect with ratio coherence imagery, but caution that “weather data, geological information, and adequate temporal and baseline separations between SAR data sets are necessary for correct interpretation of the ratio coherence imagery.” Finally, Gray and Farris-Manning [43] report that coherence between passes over different terrains (e.g., grass and trees) can vary, and appears to depend on wind conditions.

5.3.4 Description of Other Effects

This section describes a number of effects that occur in the sensor-scene interaction. The mathematical models will not be developed in this section in terms of uncertainty, as their effects can generally be derived from those already discussed in this chapter.

Volume scattering occurs when the wave penetrates the volume of the reflecting

medium, as often happens in forested areas. A signal may bounce from a branch, to a

trunk, then to the ground, for example, before returning to the receiver. In this case, a

more complicated signal is returned that includes both surface and volume effects.

Analysis can also be made difficult by standing water or moist soil conditions within the

volume scatter. Moreover, volume scatter from vegetation in a flat area will result in a

different return than similar vegetation located on a slope.

The following model was developed to correct for the effect of the underlying slope on the volume backscatter return from an area [44]:

φ = arccos[cos s·sin η - sin s·cos η·cos(Δaz)]

where s is the local slope; η is the incidence angle with respect to the geoid; and Δaz is the difference between the azimuth to the satellite and the azimuth of the surface normal. However, the model relies on a detailed DEM and accurate registration of the SAR imagery.

Bragg scattering, shown in Figure 5.15, is a resonance effect that sometimes occurs with rough features. If the scatterers’ structures are aligned along the phase fronts and regularly spaced in range, then the reflections build on each other. The effect is particularly noticeable for ocean waves, and results in the appearance of very bright lines spaced along the phase fronts. The Bragg condition occurs when the scatterer spacing s_kb satisfies s_kb = n·λ/(2·sinθ_i), with the relationships as shown in Figure 5.15; a small numerical sketch follows.
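As a small added sketch (the wavelength and incidence angle are hypothetical), the Bragg-resonant spacing can be computed directly from the condition above:

    import math

    def bragg_spacing(n, wavelength_m, incidence_deg):
        """Resonant spacing s_kb = n * wavelength / (2 * sin(theta_i))."""
        return n * wavelength_m / (2.0 * math.sin(math.radians(incidence_deg)))

    # First-order resonance for a 5.7 cm signal at a 23-degree incidence angle:
    print(f"{bragg_spacing(1, 0.057, 23.0):.3f} m")  # ~0.073 m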

Figure 5.15. Bragg scattering occurs when features are aligned along the phase front of the signal. [Diagram residue omitted; it shows the slant range geometry of the regularly spaced scatterers.]

Dihedral and trihedral reflectors are often used as control points on the ground. A dihedral reflector joins two flat sides at a 90° angle, while a trihedral reflector joins three sides (see Figure 5.16). When positioned using GPS or another survey measure, and properly angled to receive the SAR signal relative to its incidence angle θ_i, such a reflector generates an extremely bright point that can easily be seen on the SAR image. However, certain ground features can also act as reflectors. A tree at the edge of a calm lake can form a dihedral angle, resulting in a bright point on the image. The right angles between buildings and pavement or the ground can act as dihedral reflectors, causing bright lines and points, a common effect in urban areas. Defense intelligence uses dihedral and trihedral characteristics in SAR images to identify tanks, planes, and other military hardware with angular features on the ground.

Figure 5.16. Trihedral reflector, used along with dihedral reflectors as ground control points. [45]

A specular point occurs when the slope of the terrain is perpendicular to the

incidence angle. When this happens, a bright return is directed back at the radar. For

example, mountain images may show bright returns from crests and ridges that have

facets directly facing the radar signal. This is one reason why the side-looking imaging

of SAR is preferred over nadir (i.e., down-looking) imaging: very bright specular returns

tend to overpower the center area of the image.

Finally, the cardinal effect occurs when features of wavelength size extend in the azimuth direction, parallel to the line of flight. If the average geometric structure is aligned with the phase fronts of the illumination, then the reflections are correlated in phase and present a bright return. The cardinal effect increases at longer wavelengths.

Thus we have seen that the characteristics of the SAR signal can result in desirable, undesirable, or mixed effects in an image, depending on the purpose for which it is intended. Clearly, there is a need to develop a comprehensive model to relate the main contributors of backscatter to each other and to the radar equation, perhaps in the form:

σ_total = σ_speckle + σ_roughness + σ_αloc + σ_volume + σ_other

where σ_total is as given in the standard radar equation, and each σ_i has a corresponding mathematical model, or combined models as appropriate.

5.4 Summary of Uncertainties in Sensor-Scene Interactions

The uncertainties for each element that we have examined are summarized below in Table 5.4, along with their corresponding mathematical models. It can readily be noted that by far the dominant uncertainties are due to wide variations in the terrain and features being imaged.

Type of Uncertainty        Defined Measurand           Mathematical Model                              Dominant Major Uncertainty
Radar point equation       Returned power              P_r = P_t·σ·G²·λ²/[(4π)³·R⁴]                    85% due to σ
Radar cross section:
  a. Speckle               Probability                 Gaussian p(x) (Section 5.2.1)                   50% for a single look
  b. Surface roughness     Small changes in            Δφ = (4π·δh/λ)·cosθ_i                           82% due to δh
                           surface height δh
  c. Local incidence       Relative pixel              P_s°/P_h° = sinθ_i/sin(θ_i - α_loc)             ~100% due to α_loc
     angle                 brightness ratios
  d. Dielectric            Loss tangent                tanδ = ε″/ε′ = c/(ω₀·ε)                         Constant factor of 1-80
     properties
Interferometry             Elevations                  μ_h = [(Δ²/12)·cos²θ + {(Δθ/(2·SNR)^(1/2))²     Min. ½ pixel width;
                                                       + (0.6·W/R)²}·R²·sin²θ]^(1/2)                   max. 25-100 m
Polarimetry                Copolar ratio (CR)          CR = ⟨r_VV/HH⟩                                  >50% of CR

Table 5.4. Summary of Uncertainties in Sensor-Scene Interactions. The responses are heavily dominated by variations in the landscape.

5.5 Note and References

Note

1. Per conversation with Dr. Dorota Grejner-Brzezinska, GPS scientist, at The Ohio State University Center for Mapping, on July 11, 2001. From her extensive theoretical and practical work with orbital and airborne GPS positioning accuracies, given reasonable assumptions of baseline, differential corrections with base stations, and other standard corrections, height uncertainties are generally less than 10 cm in the airborne case and less than 30 cm in the orbital case. Reduced error can be expected when the baseline, pointing accuracy, and other measures are controlled to a greater degree of accuracy.

References

[1] Wessel, Paul, “Uncertainty in Derived Quantities,” URL http://www.soest.hawaii.edu/wessel/courses/gg313/DA book/node23.htm, created 12/7/2000 and accessed 7/1/2001, p. 4 of 7.

[2] Zebker, H., van Zyl, J., and Elachi, C., “Polarimetric Radar System Design,” from Radar Polarimetry for Geoscience Applications. Ulaby, F. and Elachi, C., Editors, Norwood, Massachusetts: Artech House, pp. 273-280.

[3] Curlander, John C. and McDonough, Robert N., Synthetic Aperture Radar: Systems and Signal Processing. New York: John Wiley & Sons, Inc., 1991, pp. 72-94.

[4] Ibid., p. 324.

[5] Raney, Keith R., “Radar Fundamentals: Technical Perspective,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 20-21.

[6] ------, Synthetic Aperture Radar: Systems and Signal Processing. New York: John Wiley & Sons, Inc., p. 326.

[7] Ulaby, F.T., Moore, R.T., and Fung, A.K., Microwave Remote Sensing, Active and Passive. Volume II: Radar Remote Sensing and Surface Scattering and Emission Theory. Reading, Massachusetts: Addison-Wesley Publishing Company, 1982, p. 460.

[8] Ibid., p. 476.

[9] ------, from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 32-46.

[10] Ibid., pp. 67-82.

[11] URL http://www.alaska.edu/user serv/amplitude.html, example from ASF ERS-1 SAR image #8154400, accessed 7/12/01.

[12] URL http://www.alaska.edu/daac documents/cdrom images/60769200.gif, accessed 7/12/01.

[13] Sears, P., Zemansky, M., and Young, H., University Physics. Sixth Edition. Reading, Massachusetts: Addison-Wesley Publishing Company, 1982, p. 523.

[14] Lewis, Anthony J., “Geomorphic and Hydrologic Applications of Active Microwave Remote Sensing,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 616-617.

[15] Sheen, D.R. and Johnston, L.P., “Statistical and Spatial Properties of Forest Clutter Measured with Polarimetric SAR,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 3, May 1992, pp. 578-588.

[16] van Zyl, J., Burnette, C. and Farr, T., “Inference of Surface Power Spectra from Inversion of Multifrequency Polarimetric Radar Data,” Geophysical Research Letters, Vol. 18, No. 9, September 1991, pp. 1787-1790.

[17] ------, Synthetic Aperture Radar: Systems and Signal Processing. New York: John Wiley & Sons, Inc., p. 326.

[18] Lewis, A.J. and Henderson, F.M., “Radar Fundamentals: The Geoscience Perspective,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 161-162.

[19] Dobson, M. C. and Ulaby, F. T., “Mapping Soil Moisture Distribution with Imaging Radar,” Ibid., p. 411.

[20] Tomlinson, R.F., Geographical Data Handling. Volume 1: Environment Information Systems. Symposium Edition, UNESCO/IGU Second Symposium on Geographic Information Systems, Ottawa, August 1972, p. 291.

[21] Boerner, W., Mott, H., Luneburg, C., et al., “Polarimetry in Radar Remote Sensing: Basic and Applied Concepts,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, N.Y.: John Wiley & Sons, Inc., 1998, pp. 271-341.

[22] Ulaby, F.T. and Elachi, C., Radar Polarimetry for Geoscience Applications. Norwood, MA: Artech House, Inc., 1990, p. 33.

[23] Raney, Keith R., “Radar Fundamentals: Technical Perspective,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 118-119.

[24] ------, Radar Polarimetry for Geoscience Applications. Norwood, MA: Artech House, Inc., 1990, p. 258.

[25] Sheen, Dan R., “Statistical and Spatial Properties of Forest Clutter Measured with Polarimetric Synthetic Aperture Radar,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 3, May 1992, pp. 578-588.

[26] Lim, H.H., Swartz, A.A., et al., “Classification of Earth Terrain using Polarimetric Synthetic Aperture Radar Images,” Journal o f Geophysical Research, Vol. 94, No. B6, June 10, 1989, pp. 7049-7057.

[27] EQrosawa, Haruto, “Degree of Polarization of Radar Backscatters from a Mixed Target,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 2, March 1997, pp. 466-470.

[28] Cloude, S.R. and Pottier, E., “An Entropy Based Classification Scheme for Land Applications of Polarimetric SAR,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 68-76.

[29] Lee, J.S., Hoppel, K.W., et al., “Intensity and Phase Statistics of Multilook Polarimetric and Interferometric SAR Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 5, September 1994, pp. 1017-1028.

[30] van Zyl, J.J., Burnette, C F., and Farr, T.G., “Inference of Surface Power Spectra from Inversion of Multifrequency Polarimetric Radar Data,” Geophysical Research Letters, Vol. 18, No. 9, September 1991, pp. 1787-1790.

[31] Freeman, A. and Durden, S.L., “A Three-Component Scattering Model for Polarimetric SAR Data,” a report of the Jet Propulsion Laboratory, California Institute of Technology, 30 pages.

[32] Joughin, Ian R., Winebrenner, Dale P., and Percival, Donald B., “Probability Density Functions for Multilook Polarimetric Signatures,”IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 3, May 1994, pp. 562-574.

[33] Mrstik, V., VanBlaricum, G., Cardillo, G., and Fennell, M., ‘Terrain Height Measurement Accuracy of Interferometric Synthetic Aperture Radars,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 1, January 1996, pp. 219-228.

169 [34] Madsen, Soren N. and Zebker, Howard A., “Imaging Radar Interferometry,” from Principles & Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, pp. 359-380.

[35] Hagberg, J.O. and Ulander, L.M., “On the Optimization of Interferometric SAR for Topographic Mapping,” 1KEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 1, January 1993, pp. 303-306.

[36] Zebker, H.A., Rosen, P.A., and Hensley, S., “Atmospheric Effects in Interferometric Synthetic Aperture Radar Surface Deformation and Topographic Maps,” Journal of Geophysical Research, Vol. 102, Issue B4,4/10/97, pp. 7547- 7563. [37] Rodriguez, E. and Martin, J.M., “Theory and Design of Interferometric Synthetic Aperture Radars,” IEEE Proceedings — F, Vol. 139, Number 2, April 1992, pp. 147- 159.

[38] Zebker, H.A. and Villasenor, J., “Decorrelation in Interferometric Radar Echoes,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 5, September 1992, pp. 950-959.

[39] Franceschetti, G. and lodice. A., ‘The Effect of Surface Scattering on IFSAR Baseline Decorrelation,” Journal of Electromagnetic Waves and Applications, Vol. 11, 1997, pp. 353-370.

[40] Coulson, S.N., “SAR Interferometry with ERS,” Earth Space Review, Vol. 5, No. 1, 1996, pp. 9-16.

[41] Farr, Tom, et al., “Mission in the Works Promises Precise Global Topographic Data,” EOS Transactions, American Geophysical Union, Vol. 76, No. 22, May 3, 1995.

[42] Lee, H. and Liu, J. G., “Analysis of Topographic Decorrelation in SAR Interferometry using Ratio Coherence Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 2, February 2001, pp. 223-231.

[43] Gary, L. and Farris-Manning, P., “Repeat-Pass Interferometry with Airborne Synthetic Aperture Radar,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 1, January 1993, pp. 180-191.

[44] Pairman, D., Beiliss, S.E., and McNeill, S.J., ‘Terrain Influences on SAR Backscatter around Mt. Taranaki, New Zealand,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 4, July 1997, pp. 924-932.

[45] URL http://www.asf.alaska.edu/calval/old cr.gif, accessed 7/12/01.

170 CHAPTER 6

UNCERTAINTY IN ANTENNA/RECEIVER AND PROCESSOR FUNCTIONS

A SAR system is defined here as comprising one or two antennas and receivers, data storage and transmission components, and a series of signal processing steps. To form an image from the signal data, hundreds of mathematical operations must be performed on each data sample [1]. This chapter examines the major uncertainties inherent in the antenna/receiver and processor functions, related to noise, range and Doppler ambiguities, radiometry and geometry, other factors, and InSAR.

The elements of uncertainty in a SAR system are well documented and extensively modeled in [2]. This chapter discusses only the dominant contributors to uncertainty, along with typical values of parameters where available. The methodology is followed to the extent necessary to demonstrate its application to uncertainties in a SAR system, and is by no means comprehensive. As in Chapter 5, when equations, typical parameter values, or uncertainties cannot be located, the application of the methodology is only partial. The combined total uncertainty at the end of the chapter should therefore be taken as a rough estimate only. Table 6.1 summarizes the aspects of uncertainty in antenna/receiver and processor functions, along with their degree of coverage in this chapter according to available sources. A description of the major elements follows.

Chapter 6: Overview of Uncertainty Analysis

                          Step 1.    Step 2.  Step 3.  Step 4.   Step 5.  Step 6.  Step 7.
                          Define     Math.    Inputs   Find      Find     Report   Repeatab.
                          measurand  model    xi       μ(xi)'s   μc       results  & Reprod.
1. Receiver noise            •         •        •        •         •
2. Doppler ambiguities       •         •        C        C
3. Range ambiguities         •         •        C        C
4. Radiometry                •         •        C        C         O
5. Geometry                  •         •        (see Doppler and range ambiguities)
6. Range curvature           •         •
7. InSAR                     •         •        (see Section 5.3.2)

• Quantity given   C Quantity partly given   O Numerical value only

Table 6.1. Overview of Uncertainty Analysis in Antenna/Receiver and Processor Functions. Not all quantities could be located; the degree of completion is indicated by the symbols in the chart. Columns 6 and 7 would be completed in a testing environment.

6.1 Uncertainty due to Receiver Noise

The main uncertainty in the antenna/receiver part of the system arises because the signal is so dispersed that, by the time it is received back at the sensor, it is a tiny fraction of its original strength. The signal is then amplified in order to detect a response, but in doing so the background noise is also amplified. This background noise can be attributed to internal noise, thermal ("white Gaussian") noise, nonlinear effects, and other effects. The receiver noise temperature components can be grouped into a single parameter, the operating noise factor F_op, and calculated according to individual sensor design constraints, using the relation [3]:

F_op = (N_int + G_a k T_s)/(G_a k T_s) = (T_e + T_s)/T_s

where N_int = G_a k T_e is the receiver's internal noise, G_a is the gain, k is Boltzmann's constant (1.38 × 10⁻²³ J/K), T_e is the receiver's effective noise temperature, and T_s is the temperature of the source noise. As the equation shows, G_a and k cancel out, leaving the expression on the right. Then, using typical values of T_e = 640 K and T_s = 248 K, the operating noise factor is F_op = 3.58. [3]

This can also be expressed as a noise figure in decibels using the relation 10·log10(F_op) = 5.5 dB.

The measurand in this case is the operating noise factor, and the equations for determining the associated uncertainties are:

μ_Fop/Te = (1/T_s) · μ_Te

and

μ_Fop/Ts = (−T_e/T_s²) · μ_Ts
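To make the propagation concrete, the short sketch below evaluates F_op and combines the two sensitivity terms in quadrature. Only T_e and T_s come from the text; the uncertainty values u_Te and u_Ts are hypothetical placeholders chosen for illustration.

    import math

    T_e = 640.0   # receiver effective noise temperature, K (typical value from text)
    T_s = 248.0   # source noise temperature, K (typical value from text)
    u_Te = 20.0   # assumed standard uncertainty in T_e, K (hypothetical)
    u_Ts = 10.0   # assumed standard uncertainty in T_s, K (hypothetical)

    F_op = (T_e + T_s) / T_s                 # operating noise factor, = 3.58
    u_F_Te = (1.0 / T_s) * u_Te              # contribution from T_e
    u_F_Ts = (-T_e / T_s**2) * u_Ts          # contribution from T_s
    u_F = math.sqrt(u_F_Te**2 + u_F_Ts**2)   # combined standard uncertainty

    print(f"F_op = {F_op:.2f}  ({10 * math.log10(F_op):.1f} dB)")
    print(f"combined uncertainty u(F_op) = {u_F:.3f}")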

In general, noise tends to be spread uniformly in frequency and is Gaussian in amplitude as a function of time [4]. By contrast, the radar signal has a particular spectral makeup according to the transmitted pulse. The same spectral makeup is used to create a matched filter, which is identical in form to the transmitted pulse. This filter is then applied to the return signal, and by "matching" the transmitted and returned waveforms much of the unwanted noise is removed. However, some noise remains that generates an uncertainty in the range measurement R. The uncertainty has been calculated according to the following equation in terms of the measurand R [5]:

δR = c/[4B(SNR)^(1/2)]

where c is the speed of light, B is the bandwidth, and SNR is the signal-to-noise ratio. B is defined as a measure of the span of frequencies (or the frequency limiting stages) available in the signal. Typical bandwidths are on the order of 20 MHz in range and 1 kHz in azimuth; the bandwidth determines the ultimate resolution available for the image [6]. SNR is defined as the ratio of the root-mean-square (RMS) signal power to the RMS noise power at the output of a radar receiver [7].

For example, an SNR of 0 dB is considered adequate for a simple hard target with 300 pulse powers averaged, but detection of a single pulse requires an SNR of 16 dB [8]. Thus, calculating a typical (mid-range) δR, we use c = 3 × 10⁸ m/s (85 dB·m/s); the factor 4 (6 dB); a typical mid-range bandwidth B = 10⁷ s⁻¹ (70 dB·s⁻¹); and SNR = 8 dB, or about 6.3 (the average of the two example values above), to obtain:

δR = (3 × 10⁸ m/s)/[4 · (10⁷ s⁻¹) · (6.3)^(1/2)] ≈ 3 m

or, working entirely in decibels, 85 − 6 − 70 − 4 = 5 dB relative to 1 m. Thus, a typical uncertainty in R due to SNR and bandwidth is on the order of 3 m.
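The range-uncertainty expression is easy to check numerically; the sketch below evaluates it for the typical values above. The function name and its parameters are ours, introduced only for illustration.

    import math

    def range_uncertainty(bandwidth_hz: float, snr_db: float, c: float = 3.0e8) -> float:
        """Range uncertainty dR = c / (4 * B * sqrt(SNR)), with SNR given in dB."""
        snr_linear = 10.0 ** (snr_db / 10.0)
        return c / (4.0 * bandwidth_hz * math.sqrt(snr_linear))

    # Typical mid-range values from the text: B = 70 dB*s^-1 (1e7 Hz), SNR = 8 dB
    print(f"dR = {range_uncertainty(1.0e7, 8.0):.2f} m")   # ~3 m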

6.2 Uncertainty due to Doppler and Range Ambiguities

The Doppler centroid f_DC can be derived from first principles, assuming that the velocity of the object is much less than the velocity of light, according to the relation:

f_DC = [2/(λR)] (V_s − V_t) · (R_s − R_t)

where λ is the wavelength, R is the slant range, V_s and V_t are the sensor and target velocity vectors, and R_s and R_t are the sensor and target position vectors. A pulse's duration determines the resolution and accuracy in range, while the reciprocal of its duration determines its Doppler resolution and accuracy. Figure 6.1 shows how the use of the Doppler frequency together with the pulse time delay provides the location of each image point in two dimensions. A specific delay time t_0 = 2R(0)/c crosses the Doppler shift f_D0 at s = 0 on the following circle:

R²(0) = x² + R_g² + H²

where R is the slant range, x is the ground range in azimuth (i.e., the direction ahead of the sensor, or along track), R_g is the ground range out to the side (i.e., across track), and H is the flying height above the terrain. The equation of the hyperbola for the Doppler shift f_D0 depicted in Figure 6.1 is the following:

|2V_st/(λ f_D0)| = |R(0)/x| ≥ 1

where V_st is the relative velocity and λ is the wavelength. The circle and the hyperbola intersect at only four points in the plane of range R_g and along-track distance x. The left/right ambiguity is resolved by knowledge of the side of the platform from which the signal is directed. The branch of the hyperbola is given by the sign of the Doppler shift. In this way, a terrain point can be localized in two dimensions. [10]


Figure 6.1. The terrain point is localized in two dimensions at the intersection of the Doppler shift f_D0 in azimuth and the time delay t_0 in range. [9]

A set of techniques called pulse compression allows a single pulse to provide both good range and Doppler resolution. SAR uses pulse burst waveforms, which allow the pulse to be as short as necessary (to allow the range to be as accurate as desired). However, along with the use of this waveform come the problems of range and Doppler ambiguities. Because of these ambiguities, the position of a target may be uncertain to some degree owing to blind ranges not seen by the radar. This happens because the radar cannot see a target while it is transmitting. If targets that are farther away require too long a time to return, then those targets will not be imaged, or they can be imaged in the wrong places during interpulse periods. Similarly, if something is moving in the terrain during imaging, it will be displaced from its position in the image. The same situation occurs in the Doppler domain (since it is the reciprocal of the time domain).

It is desirable to select the pulse repetition frequency f_p such that the return signal does not overlap with a transmitted pulse. Overlap can occur when a bright point target is located at an azimuth position outside the main Doppler band (i.e., in the antenna sidelobes), which results in an apparent (ghost) target at the azimuth position:

x_i′ = x_t − λR f_p/(2V_st)

where x_t is the position of the target point and V_st is the relative Doppler velocity between the sensor and the target. The uncertainty in the ghost target location is then given by the following expressions:

μ_x′/xt = 1 · μ_xt

μ_x′/λ = [−R f_p/(2V_st)] · μ_λ

μ_x′/R = [−λ f_p/(2V_st)] · μ_R

μ_x′/fp = [−λR/(2V_st)] · μ_fp

μ_x′/Vst = [λR f_p/(2V_st²)] · μ_Vst

Values for all of the parameters and uncertainties could not be located, hindering further application of the methodology. However, we can observe that a combination of large values for R and f_p, a low value for V_st, and reasonable values for the uncertainties could cause significant problems with ghost points in the azimuth (x) direction. (This effect can be minimized through good design.)
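Although the published values needed to complete the analysis could not be located, the propagation itself is mechanical. The sketch below carries it out with assumed, purely illustrative parameter values and uncertainties; none of the numbers come from the cited sources.

    import math

    # Illustrative C-band spaceborne parameters (assumed, not from the cited sources)
    lam = 0.0566     # wavelength, m
    R = 850e3        # slant range, m
    f_p = 1680.0     # pulse repetition frequency, Hz
    V_st = 7100.0    # relative sensor-target velocity, m/s

    # Assumed standard uncertainties of the inputs (hypothetical)
    u = {"lam": 1e-6, "R": 10.0, "f_p": 0.1, "V_st": 1.0}

    # Sensitivity coefficients from the partial derivatives of
    # x' = x_t - lam * R * f_p / (2 * V_st)
    c = {
        "lam": -R * f_p / (2 * V_st),
        "R": -lam * f_p / (2 * V_st),
        "f_p": -lam * R / (2 * V_st),
        "V_st": lam * R * f_p / (2 * V_st**2),
    }

    u_x = math.sqrt(sum((c[k] * u[k])**2 for k in c))
    print(f"combined uncertainty in ghost-target azimuth position: {u_x:.2f} m")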

Ambiguities in range can be significant for spaceborne radars, in which several interpulse periods (τ_p = 1/f_p) elapse between the transmission and reception of a pulse. The phenomenon is due to the receipt of ambiguous pulses arriving from the ranges:

R_ij = c(t_i + j/f_p)/2,   j = ±1, ±2, ..., ±n_h

Here j, the pulse number, is zero for the desired pulse, positive for preceding interfering pulses, and negative for succeeding ones, while n_h is the number of pulses to the horizon.

The contribution from each ambiguous pulse can be found by determining the incidence angle and the backscatter coefficient for each pulse j, in each interval of time i, at a range delay t_i. The incidence angle η_ij at a point i and pulse j is given by:

η_ij = sin⁻¹[(R_s sin γ_ij)/R_t]

where R_t is the target distance and R_s is the sensor distance from the center of the earth, and γ_ij is the antenna angle that corresponds to η_ij [11].

Applying the methodology to the incidence angle η_ij, we obtain:

μ_η/Rs = [1 − (R_s sin γ_ij/R_t)²]^(−1/2) · (sin γ_ij/R_t) · μ_Rs

μ_η/γ = [1 − (R_s sin γ_ij/R_t)²]^(−1/2) · (R_s cos γ_ij/R_t) · μ_γ

μ_η/Rt = [1 − (R_s sin γ_ij/R_t)²]^(−1/2) · (−R_s sin γ_ij/R_t²) · μ_Rt

These expressions give the uncertainty of the range ambiguity with respect to each of its parameters. Again, values could not be located for all of the uncertainties in the above set of three equations, so we proceed no further with the methodology here and continue to the next topic.

6.3 Uncertainty due to Radiometry and Geometry

Radiometric calibration refers to the accuracy in relating an image pixel to the characteristics of the target scatterer in the landscape, while geometric calibration of an image refers to the accuracy in registering a pixel to an earth-fixed grid [11]. Each is discussed below, along with its associated uncertainties.

Although accurate radiometric calibration is essential for quantitative analysis of SAR imagery, the effect of local topography is often not taken into account. [12] reports that several decibels of error can arise when SAR imagery is not corrected for terrain-related variation in pixel scattering areas, a finding confirmed by [13]. Regarding radiometric accuracy, [13] discusses the importance of taking other factors into account beyond the use of calibration techniques and devices. Further, the introduced error can vary strongly across the image. For airborne platforms (e.g., AIRSAR), the effect of local topography is accentuated, and extremely large errors, some well over 10 dB, can be expected in the near range. The authors also report that the effects of radiometric calibration cannot be completely decoupled from geometric correction. The DEM information needs to be warped into SAR space, but at the same time the SAR image needs to be warped into cartographic space to perform geometric corrections.

The goal of radiometric calibration is "to make reliable, repeatable measurements of target radar cross-section."

Internal calibration is the use of built-in devices (such as a calibration tone) to characterize the radar system performance. External calibration is the use of ground targets to characterize the system performance, such as corner reflectors or distributed targets with known scattering characteristics (e.g., σ⁰).

The radiometric error model is fully developed by Curlander and McDonough [15], under the assumption that the distribution of the estimated errors for each term is Gaussian and uncorrelated. The backscatter is given as:

σ⁰ = (P_R − P̄_N)/K(R)

and the fractional uncertainty in the estimate of σ⁰, in terms of the noise power and the correction factor, as:

μ_σ⁰/σ⁰ = [(μ_K/K(R))² + (μ_PN/(σ⁰ K(R)))²]^(1/2)

where P_R is the received power, P̄_N is the mean noise power over a sample block, and K(R) is the range-dependent scale factor representing the combined errors.

The various radiometric sources of uncertainty and their calibrated values for the

ERS-1 satellite are given in Table 6.2, showing an absolute uncertainty of 1.26 dB. This was the only numerical data on radiometric uncertainty that could be located.

ERS-1 SAR UNCERTAINTY

Radiometric Stability                         Uncertainty (dB)
  Relative transponder gain variation             ± 0.70
  Orbital variations:
    Transmitter power                             ± 0.20
    Antenna gain                                  ± 0.20
    Antenna pattern (roll)                        ± 0.05
  Noise (thermal, quantization)                   ± 0.80
  Atmosphere                                      ± 0.26
  Pulse replica                                   ± 0.10
  Ground processor errors                         ± 0.20
  RSS Total                                       ± 1.15

Radiometric Accuracy
  Radiometric stability — RSS total                 1.15
  Transponder absolute accuracy                     0.50
  RSS Total                                         1.26

Stability: similar to relative calibration accuracy
Accuracy: similar to absolute calibration accuracy

Table 6.2. ERS-1 SAR Radiometric Uncertainty [after 14]
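The RSS (root-sum-square) totals in Table 6.2 follow from combining the independent components in quadrature, and the sketch below reproduces them. The component list mirrors the table; the variable names are ours.

    import math

    # Stability components from Table 6.2, in dB
    stability_components = [0.70, 0.20, 0.20, 0.05, 0.80, 0.26, 0.10, 0.20]
    rss_stability = math.sqrt(sum(c**2 for c in stability_components))
    print(f"RSS stability = {rss_stability:.2f} dB")   # 1.15 dB

    # Absolute accuracy adds the transponder absolute accuracy in quadrature
    rss_accuracy = math.sqrt(rss_stability**2 + 0.50**2)
    print(f"RSS accuracy  = {rss_accuracy:.2f} dB")    # 1.26 dB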

Geometric calibration is the process of characterizing each of the geometric performance parameters for a given data set. Absolute measures deal with uncertainties in location and image orientation, while relative measures deal with image scale and skew uncertainties. Along-track positional errors cause an uncertainty Δx_t in the azimuth location of the target according to:

Δx_t = ΔR_x R_t/R_s

where ΔR_x is the uncertainty in the along-track sensor position vector, R_t is the target position vector, and R_s is the sensor position vector. Cross-track positional errors cause an uncertainty Δr_t in the range location of the target according to:

Δr_t = ΔR_y R_t/R_s

where ΔR_y is the cross-track sensor position error.

The absolute pixel location is found by simultaneous solution of three equations for the three unknowns x, y, and z: the earth geoid model, the SAR Doppler equation, and the SAR range equation. The Doppler and range equations were discussed earlier; here we add the earth (geoid) model equation. An oblate spheroid can be used to model the earth's shape as follows:

(x_t² + y_t²)/(R_e + h)² + z_t²/R_p² = 1

where R_e and R_p are the equatorial and polar radii of the earth and h is the target height above the spheroid. These and many other mathematical models are available to describe each of these processes; however, in general the corrections follow standard photogrammetric practice and will not be further developed here.
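To make the simultaneous solution concrete, the sketch below solves the range, Doppler, and spheroid equations numerically for a target position. It assumes a stationary target in an earth-fixed frame and zero target height, and every numerical value (sensor state, wavelength, range) is invented for illustration; this is a minimal sketch, not an operational geolocation routine.

    import numpy as np
    from scipy.optimize import fsolve

    # Invented sensor state and measurements (illustrative only)
    R_s = np.array([7178137.0, 0.0, 0.0])    # sensor position, m (earth-fixed frame)
    V_s = np.array([0.0, 7500.0, 0.0])       # sensor velocity, m/s
    lam = 0.0566                             # wavelength, m
    rho = 850e3                              # measured slant range, m
    f_dc = 0.0                               # measured Doppler centroid, Hz
    R_e, R_p, h = 6378137.0, 6356752.3, 0.0  # spheroid radii and target height, m

    def equations(p):
        """Range, Doppler, and spheroid equations; all three are zero at the target."""
        x, y, z = p
        los = R_s - np.array([x, y, z])      # line-of-sight vector
        f_range = np.linalg.norm(los) - rho
        f_doppler = f_dc - (2.0 / (lam * rho)) * V_s.dot(los)
        # Spheroid equation, scaled to meters for better numerical conditioning
        f_geoid = ((x**2 + y**2) / (R_e + h)**2 + z**2 / R_p**2 - 1.0) * R_e
        return [f_range, f_doppler, f_geoid]

    # Initial guess: near the sub-satellite point, displaced toward +z
    x, y, z = fsolve(equations, [6350e3, 0.0, 500e3])
    print(f"target position: ({x:.0f}, {y:.0f}, {z:.0f}) m")

With the zero-Doppler geometry chosen here, the Doppler equation simply forces the target into the plane perpendicular to the velocity vector, and the initial guess selects the northern of the two remaining intersections.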

Geometric distortion arises mostly from platform ephemeris errors, error in the estimate of the relative target height, and signal processing errors. Geometric calibration is the process of measuring the various error sources and characterizing them in terms of calibration accuracy parameters. Geometric rectification describes the processing step in which the image is resampled from its natural (distorted) projection into a format better suited for scientific analysis. Geocoding is the process of resampling the image data into a specific output image format, i.e., a uniform earth-fixed grid such as a standard map projection. Mosaicking is the process of assembling into one frame several independently geocoded image frames that overlap in their area of coverage. [16]

However, both geometric and radiometric calibration suffer from considerable difficulties in implementation:

“Only in special cases have SAR systems produced radiometrically and geometrically calibrated data products. The implication of poorly calibrated data products on the scientific utilization of the data is far reaching. Without calibrated data, quantitative analysis of the SAR data cannot be performed, and therefore the full value of the data set is not realized.” [1]

6.4 Other Uncertainties in SAR Processes

SAR processing involves the use of a correlator to transform a "smeared" point-target phase history into a point image target. This is done by using either a full 2-D filter to compensate for the dispersed point target, or two 1-D filtering operations to form a range-compressed image. Many different approaches are possible, using algorithms involving spectral analysis, range/Doppler processing, or 2-D wave-domain transforms.

A further factor that requires correction is called range curvature, also known as range migration or range walk. This is a curvature of the entire wavefront with respect to the observation path of the sensor. It occurs because signal data from each scatterer appear at larger ranges for azimuth positions that are farther away from the zero-Doppler azimuth. Range curvature is a significant consideration for all spaceborne and most airborne platforms. The correction involves an integration along the entire wavefront. The number of resolution cells in range spanned by the range curvature is approximately

Δr/r_R = λ²R/(32 r_a² r_R)

where r_R and r_a are the range and azimuth resolutions; the geometric relations are shown in Figure 6.2.

[Figure: range curvature Δr of the wavefront, shown relative to the linear FM approximation]

Figure 6.2. Wavefront curvature in the range/Doppler domain. Curvature larger than the range resolution r_R requires range curvature correction. [after 17]

Other sensor errors include radial positional error leading to azimuthal and cross-track positional error; sensor velocity errors; and target elevation errors.
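As a rough numerical check of the range-curvature expression, the sketch below evaluates it for invented but plausible spaceborne L-band values; all of the numbers are assumptions for illustration.

    # Range cells spanned by range curvature: dr/r_R = lam^2 * R / (32 * r_a^2 * r_R)
    lam = 0.24    # wavelength, m (L-band, assumed)
    R = 850e3     # slant range, m (assumed)
    r_a = 5.0     # azimuth resolution, m (assumed)
    r_R = 10.0    # range resolution, m (assumed)

    cells = lam**2 * R / (32 * r_a**2 * r_R)
    print(f"range curvature spans about {cells:.1f} range cells")   # ~6 cells

A result of several range cells indicates that range curvature correction is required, consistent with the statement above that the effect is significant for spaceborne platforms.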

6.5 Uncertainties in InSAR Processing

InSAR-related processing steps and uncertainties are discussed below as presented in [18-21]; however, the main uncertainty models were discussed earlier and will not be repeated here.

Single-look complex SAR images must first be generated; then interferometric (topography) data may be obtained. Earlier, the limiting effects of data coherence, temporal decorrelation, steep slopes (layover), radar shadows, vegetation/volume scattering, and other effects were discussed. Here we discuss the minimum requirements for producing acceptable interferometric datasets. It will be seen that some of the requirements are highly demanding. To generate elevation data, the following steps [14] are applied:

Baseline estimation — From the orbital state vectors associated with each single-look complex image, a cross-track baseline and an azimuth (along-track) offset are predicted. Note that from the relations μ_z = (ρ/B) sin θ sin α · μ_BH and μ_z = (ρ/B) sin θ cos α · μ_BV relating height uncertainty to the horizontal and vertical baseline uncertainties, and knowledge of the parameters on the right side of the equations using SIR-C values, we calculate μ_B = 5 × 10⁻⁴ μ_z. This means that in order to achieve a height accuracy of 10 meters with SIR-C, the baseline must be known to within only 5 mm (see the sketch following this list).

Coherent Registration — Offset measurements are made between the two images, which must be accurate to better than 0.1 pixel in range and azimuth (1/32-pixel registration is preferred, when feasible). These results are used to refine the baseline estimate. This requirement means that if a pixel is misregistered by more than 1/10 of a pixel, decorrelation will result.

Resampling — The second image is resampled in range and azimuth to register with the first image. As with the prior step, 0.1-pixel accuracy is required.

Coherence Estimate — The level of coherence between the two images is estimated. This allows the expected usefulness and relative accuracy of a particular interferometric pair to be determined.

Interferogram Creation — A multi-look interferogram and a coherence map are produced. The coherence map is used to locate noisy data points.

Flattening — The flat terrain phase is removed from the interferogram, using knowledge of the spacecraft geometry. What is left is phase modulation due to topography. The phase is wrapped; that is, it is modulo 2π.

Phase Unwrapping — One of many possible phase unwrapping procedures is applied to eliminate the modulo 2π ambiguities as much as possible. The resulting data supply relative elevations that are proportional to the terrain.

Rectification — The absolute position of each pixel is calculated, using information on the spacecraft positions, pixel heights, slant range and Doppler parameters. The SAR backscatter image, the derived elevation image, and the coherence image are resampled into an orthoprojection.

Ground Control — Two related distortions need to be corrected (in ERS-1 images, for example), typically using a least-squares fit to ground control: (1) vertical scale error from inaccurate knowledge of the baseline length, and (2) tilt in the DEM due to inaccurate knowledge of the baseline tilt.
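The baseline-accuracy requirement quoted under Baseline estimation above is easy to reproduce; the sketch below applies the scale factor μ_B = 5 × 10⁻⁴ μ_z stated there (the function name is ours).

    def required_baseline_accuracy(height_accuracy_m: float, scale: float = 5e-4) -> float:
        """Baseline accuracy needed for a given height accuracy: u_B = scale * u_z."""
        return scale * height_accuracy_m

    # 10 m height accuracy with SIR-C geometry -> 5 mm baseline knowledge
    print(f"{required_baseline_accuracy(10.0) * 1000:.1f} mm")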

In addition to the errors discussed above and in prior chapters, further sources of error in InSAR are [14]: multiple scattering within and among resolution cells; range and azimuth sidelobes resulting from bandwidth and resolution constraints; range and azimuth ambiguities resulting from design constraints; multipath and channel cross-talk noise that appear as low-level interference; calibration errors; and propagation delay errors from the atmosphere and ionosphere.

To mitigate these errors, a number of remedies can be applied. Shadow effects are worst in the far range, layover in the near range; therefore, shadow and layover can be largely eliminated by flying an area twice in opposite directions. Sidelobes can be weighted in the matched-filter function in range or azimuth compression (however, resolution is reduced in the process). Both range and azimuth ambiguities rise above the ambient backscatter and can significantly corrupt the interferometric phase. To some degree the range ambiguities (which occur mostly in spaceborne systems due to the longer round-trip travel time of the pulse) can be controlled by adjusting the pulse width and the pulse repetition frequency. Azimuth ambiguities are also minimized using the pulse repetition frequency.

As with the geometric uncertainties, extensive models are available in [1] to address each of these steps in detail, and they will not be repeated here. However, we will present one example for purposes of illustration. A dominant source of uncertainty in terrain height and range in the airborne (dual-receiver) case relates to the pointing accuracy of the look angle θ. From the expressions for the terrain height z = h − ρ cos θ and the range y = ρ sin θ, differentiating with respect to θ gives the following expressions of uncertainty:

μ_z/θ = ρ sin θ · μ_θ

and

μ_y/θ = ρ cos θ · μ_θ

Then, using typical values for the airborne case of ρ cos θ ≈ ρ sin θ ≈ 12.5 km and μ_θ = 0.32 mrad = 0.018° [22], we obtain a vertical uncertainty of 4.0 m and a horizontal uncertainty also of 4.0 m, due to the uncertainty in the pointing angle alone. It can be noted that, because the baseline B is fixed to the fuselage in this case with a known baseline separation, any error in baseline attitude (i.e., the angle of tilt between the two receivers) is evidenced by a systematic error across the entire image. The uncertainty in baseline attitude can be corrected using known control points in the scene [23].
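The sketch below evaluates the two pointing-angle sensitivities for the typical airborne values just given; the variable names are ours, and ρ cos θ is assumed equal to ρ sin θ, as the example implies.

    rho_sin_theta = 12.5e3   # rho * sin(theta), m (typical airborne value from text)
    rho_cos_theta = 12.5e3   # rho * cos(theta), m (assumed equal, per the example)
    u_theta = 0.32e-3        # look-angle uncertainty, rad (0.018 deg)

    u_z = rho_sin_theta * u_theta   # vertical uncertainty, m
    u_y = rho_cos_theta * u_theta   # horizontal uncertainty, m
    print(f"u_z = {u_z:.1f} m, u_y = {u_y:.1f} m")   # 4.0 m each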

Motion compensation is also required for accurate airborne interferometry. If left uncompensated, platform motion reduces the interferometric correlation and height accuracy. The motion compensation is performed by adjusting the range and phase of each image sample according to the reference tracks, as shown in Figure 6.3.

[Figure: actual tracks of antenna 1 and antenna 2, shown relative to the reference tracks]

Figure 6.3. The motion track of airborne interferometers needs to be compensated in order to obtain accurate positions in the final product. [after 14]

6.6 Summary of System and Processing Effects

The major factors influencing the uncertainties related to the antenna/receiver and processor functions are summarized below in Table 6.3. Although all of the factors have some associated uncertainty, to a large degree they are correctable with careful attention and with a strong understanding of the uncertainty of the final result. However, a major influence that cannot be easily corrected arises from the natural variation of the terrain. It can be noted from empirical results that the greater the terrain relief, the greater are the associated uncertainties in x, y, and z [24].

Type of            Major Defined        Mathematical                  Dominant
Uncertainty        Measurand            Model                         Uncertainty
Noise              δR due to noise      δR = c/[4B(SNR)^(1/2)]        ~5 dB (≈3 m)
Range & Doppler    Time and range       Δt = 2Δr/c                    Sacrifice Δt or Δr
ambiguities        ambiguities
Radiometry and     μ_x in azimuth       Δx_t = ΔR_x R_t/R_s           1 dB to >10 dB,
geometry           μ_r in range         Δr_t = ΔR_y R_t/R_s           due to terrain
Other factors      Range curvature      Δr/r_R = λ²R/(32 r_a² r_R)    Not available
InSAR              μ_z in height        μ_z/θ = ρ sin θ · μ_θ         Typical: 4 m for
                   μ_y in range         μ_y/θ = ρ cos θ · μ_θ         airborne, due to θ

Table 6.3. Summary of Major System and Processing Effects on the Uncertainty of SAR/InSAR Measurements

6.7 References

[1] Curlander, John C. and McDonough, Robert N., Synthetic Aperture Radar Systems and Signal Processing. New York: John Wiley & Sons, Inc., 1991, pp. 3-4.

[2] Ibid., 647 pages.

[3] Ibid., pp. 108-119.

[4] Toomay, J.C., Radar Principles for the Non-Specialist. Second Edition. Mendham, New Jersey: SciTech Publishing, Inc., 1998, p. 83.

[5] Ibid., p. 87.

[6] Henderson, Floyd M. and Lewis, Anthony J., Editors, Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2. New York: John Wiley & Sons, Inc., 1998, pp. 813-814.

[7] ------, Radar Principles for the Non-Specialist, p. 195.

[8] ------, Synthetic Aperture Radar Systems and Signal Processing, p. 20.

[9] Ibid., p. 19.

[10] Ibid., p. 20.

[11] Ibid., p. 11.

[12] Luckman, Adrian J., "Correction of SAR Imagery for Variation in Pixel Scattering Area Caused by Topography," IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998, pp. 344-350.

[13] van Zyl, J.J., Chapman, B.D., Dubois, P., and Shi, J., "The Effect of Topography on SAR Calibration," IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 5, September 1993, pp. 1036-1043.

[14] Curlander, John C., Carande, R., and Rosen, P., "Short Course on Interferometric Synthetic Aperture Radar: Theory and Applications," held at The Ohio State University, November 20-22, 1996.

[15] ------, Synthetic Aperture Radar Systems and Signal Processing, pp. 322-327.

[16] Ibid., pp. 370-410.

[17] Raney, Keith R., "Radar Fundamentals: Technical Perspective," from Principles and Applications of Imaging Radar. Manual of Remote Sensing. Third Edition. Volume 2, Henderson, Floyd M. and Lewis, Anthony J., Editors, New York: John Wiley & Sons, Inc., 1998, p. 91.

[18] Madsen, Soren N. and Zebker, Howard A., "Imaging Radar Interferometry," Ibid., pp. 362-363.

[19] Madsen, Soren N. and Zebker, Howard A., "Imaging Radar Interferometry," in Henderson, Floyd M. and Lewis, Anthony J., Editors, Manual of Remote Sensing. Third Edition. Volume 2, New York: John Wiley & Sons, Inc., 1998, pp. 362-363.

[20] Rodriguez, E. and Martin, J.M., “Theory and Design of Interferometric Synthetic Aperture Radars,” IEEE Proceedings-F, Vol. 139, No. 2, April 1992, pp. 147-159.

[21] Rodriguez, E., Imel, D., and Madsen, S.N., “The Accuracy of Airborne Interferometric SARs,” IEEE Transactions on Aerospace and Electronic Systems, 1995.

[22] Zebker, H.A. and Villasenor, J., "Decorrelation in Interferometric Radar Echoes," IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 5, 1992, pp. 950-959.

[23] Zebker, H.A., Werner, C., Rosen, P.A., and Hensley, S., "Accuracy of Topographic Maps Derived from ERS-1 Interferometric Radar," IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 4, 1994, pp. 823-836.

[24] Norvelle, F. Raye, "Evaluation of ERIM's 'IFSARE' Digital Elevation Models in Support of 'GeoSAR'," prepared for the Defense Advanced Research Projects Agency by the U.S. Topographic Engineering Center, Alexandria, VA, December 1996, 41 pages.

CHAPTER 7

UNCERTAINTY IN CLASSIFICATION MODELS AND ALGORITHMS

This chapter begins with a short review of what has been discussed up to this point. Chapter 2 described shortcomings in the way that SAR/InSAR data are classified, as observed in the scientific literature. The main observation was that the basic scientific method was not being followed in the design and conduct of experiments, in the following ways:

• Experimental repeatability and reproducibility are not addressed (important for production)

• Explanatory power is limited to a given dataset (often a single scene)

• The null hypothesis is not disproved (the results do not definitively demonstrate that new method "B" is better than existing method "A")

Chapter 3 examined accuracy standards, while Chapter 4 presented a new methodology for the analysis of uncertainty in the end-to-end processes involved in creating thematic maps from SAR/InSAR data. That methodology was applied to the sensor-scene interaction in Chapter 5, then to antenna and processor functions in Chapter 6.

The next steps, covered in this chapter, involve a study of the uncertainties involved in: (1) the creation of a series of intermediate products to be operated on by the classifier, and (2) the model or algorithm used to classify the scene. We leave the discussion of validating results (in reference to ground truth) to Chapter 8. It can be noted that most of the material in Chapters 7 and 8 applies not only to thematic map output from SAR/InSAR data, but also to other types of remotely sensed image data that are simplified (i.e., generalized or classified) for particular uses.

First we recognize that the input to these models and algorithms is comprised of a set of derived data, created entirely from the three parts of the basic signal: the amplitude, phase, and polarization. Differences, ratios, simple spatial and spectral statistics, and other variations are applied to the data to generate a set of derived, intermediate products, such as those shown in Table 7.1. Except for the magnitude image, these intermediate products are only useful (in the process under study) as inputs to the classification models and algorithms. While data from a single-antenna, single-polarization, single-frequency SAR allow only a few intermediate products to be generated for a given scene, a quadrature-polarization (i.e., HH, VV, HV, and VH), three-frequency SAR in bistatic (two-antenna, two-receiver) mode allows a whole host of intermediate products to be generated. These intermediate products can be generated from single or multiple datasets captured at different spatial frequencies, polarizations, times, system parameters, and antenna configurations. In addition, the same scene imaged at multiple times offers the ability to create even more intermediate products.

In our analysis, we will assume that all necessary corrections as discussed in Chapters 5 and 6 have been completed prior to this stage, such as corrections for noise, radiometry, geometry, ground referencing, ambiguities, InSAR baseline, and so on.

• Magnitude image
• Interferogram
• Digital elevation model
• Normalized magnitude ratios
• Depolarization ratio
• Polarimetric covariance matrix
• Coherence map
• Correlation
• Correlation gradient
• Phase differences
• Volume decorrelation
• Stokes polarimetric matrix
• Elevation gradient

Table 7.1. Examples of intermediate data products derived from a SAR signal.
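To make the notion of intermediate products concrete, the sketch below derives a few of the Table 7.1 quantities from two synthetic complex channels. The data are random placeholders, and the windowed coherence estimator shown is one common choice rather than the only one.

    import numpy as np

    rng = np.random.default_rng(0)
    shape = (128, 128)
    # Synthetic single-look complex channels (random placeholders for real HH/VV data)
    s_hh = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    s_vv = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

    magnitude = np.abs(s_hh)                      # magnitude image
    phase_diff = np.angle(s_hh * np.conj(s_vv))   # HH-VV phase difference
    mag_ratio = np.abs(s_hh) / np.abs(s_vv)       # normalized magnitude ratio

    def boxcar(a, n=4):
        """Crude n-by-n boxcar average (multilooking) by block reshaping."""
        h, w = a.shape
        return a[:h - h % n, :w - w % n].reshape(h // n, n, w // n, n).mean(axis=(1, 3))

    # Windowed sample coherence between the two channels
    num = boxcar(s_hh * np.conj(s_vv))
    den = np.sqrt(boxcar(np.abs(s_hh)**2) * boxcar(np.abs(s_vv)**2))
    coherence = np.abs(num / den)
    print(coherence.mean())   # low (though biased above zero) for independent channels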

Much of the discussion will necessarily be qualitative due to the nature of the topic, but later in the chapter a more quantitative approach will be explored.

7.1 Problems in the Current Modus Operandi and Examples

Perhaps the most direct way to understand the shortcomings of the current modus operandi of SAR/InSAR classification models and algorithms is to look at a set of examples. There are literally thousands of studies that explore different aspects of SAR/InSAR classification models and algorithms. In nearly all of these studies, the two dominant approaches rely on spatial statistics and/or physics-based models. For the stated purpose of demonstrating the methodology's usefulness here, three recent studies will be examined. These studies are generally representative of the full range of studies, as confirmed by a thorough examination of the wide range of topics covered in journal articles [1-27].

The first study [28] presents a statistical approach for determining the area occupied by a building footprint (that is, the ground area upon which a building structure stands). Such a capability would be useful in generating building outlines (i.e., footprints) for various types of urban assessments. The second study [29] presents a physics-based approach for estimating crop and soil conditions, which if successful could be useful to farmers and decision makers. Finally, the third study [30] is another statistical approach, for classifying natural forest and grassland areas, perhaps useful to foresters and environmentalists.

As before, we consider the measurand to be the quantity that is being determined or measured, as that measurement is represented in a particular model or algorithm. The three studies are summarized in Tables 7.2, 7.3, and 7.4. From examining the studies, the following overall observations can be made:

1. Bias is evident in the selection of dataset(s) that are likely to produce favorable results for the algorithm under study.

2. Bias is evident in the choice and grouping of feature classes to fit the characteristics of the dataset.

3. Considerable (and time-consuming) adjustment of parameters is typical, in order to optimize results on a given scene; such parameters are likely not to perform well on a different scene.

4. Comparisons with ground truth are neither rigorous nor carefully controlled, if done at all.

5. Results tend to be presented in a carefully constructed manner to make the results look good, even in cases where a careful examination reveals that the results are, in fact, quite poor.

For example, a typical statement in a refereed paper is, "the model results can only be applied to a set of plant and soil characteristics identical to those used in calibration," accompanied by statements such as, "...we feel that these weaknesses will not necessarily defeat our objective, which was simply to demonstrate a concept." [3, italics added] Also, in the same paper a linear model of the form y = x is compared to measurements on a graph showing a wide scattering of points: the only clear relationship is that the measurements (randomly) agree with the model only to within an order of magnitude.

6. Highly sophisticated and complex models are proposed, then tested under highly restrictive conditions; however, the results can scarcely justify the amount of effort. It is likely that many of these studies represent years of intensive work.

It also became clear in the analysis of each study that a strong understanding of the methodology (as covered in previous chapters) aided in understanding the shortcomings and limitations of a given classification model or algorithm.

From "Digital Surface Models and Building Extraction: A Comparison of IFSAR and LIDAR Data," by P. Gamba and B. Houshmand, 2000. [28]

Background: The building footprints (i.e., the outline of the ground area occupied by a building) were determined for part of an urban area, using DEMs created from IFSAR (i.e., InSAR) data.

Measurand: The measurand Y is given as a combination of three input variables: lines, planes, and step values as calculated across the dataset, from IFSAR-derived elevation posts.

Model: Primarily geometric and threshold-driven; details not given.

Data: A single sub-scene of eight buildings in downtown Los Angeles was used. The DEM density was 5-meter post spacing. All buildings "happened" to line up parallel with the flight line, which created lines on the image marking the dihedral angles between the building walls and the ground. These lines were used to begin the search process in the algorithm.

Ground Truth: Orthophoto (no associated uncertainty is given).

Results: The footprint areas were determined. The measurement error for individual buildings ranged from a low of −58% (underestimated area) to a high of +77% (overestimated area) compared to the ground truth. A quick calculation showed that the IFSAR-derived footprint area was 21.1% higher on average than the "true" footprint area on the orthophoto, while the standard deviation was 47.8%.

Limitations: Significant shadowing and layover were reported, along with artifacts from multiple scattering events. Thresholds and parameters were determined interactively by experimentation to minimize error. The pre-selection of a portion of the scene must be criticized, in that the eight buildings tested were parallel to the flight line and of a large enough size (i.e., large downtown buildings) to allow many post spacings per building. Further, the scene was apparently treeless or nearly so, which considerably simplified the interpretation problem.

Table 7.2. Analysis of Study to Determine Building Footprints.

From "Ku- and C-Band SAR for Discriminating Agricultural Crop and Soil Conditions," by M. Susan Moran et al., 1998. [29]

Background: Both the green leaf area index (GLAI) and the soil moisture were to be determined for an agricultural area.

Measurand: Expressed as the combined backscatter of a model that relies on a set of input parameters that were mostly field-measured and entered by the user: (1) volumetric moisture content, (2) the modeled backscatter from vegetation and (3) soil, (4) vegetation canopy descriptors, (5) two-way attenuation through the canopy, and (6) four model coefficients determined from the experimental data.

Model: The general water-cloud model using the above parameters.

Data: Spring and summer airborne C- and Ku-band magnitude images (two of each), and corresponding spring and summer ERS-1 spaceborne images. Data were captured at different times of day and night; the ERS-1 images were captured up to 10 days apart from the airborne images. All tests were of agricultural fields containing two types of crops: cotton and alfalfa.

Ground Truth: A visual survey was performed using air and ground measurements at about the same time as the airborne overflights. The signal was calibrated on bare fields. No estimation of uncertainty from the ground truth was offered.

Results: Results were presented in the form of scatterplots that were difficult to interpret. No quantitative assessment was offered.

Limitations: Surface roughness was not included in the model, yet this is known to be a dominating influence on backscatter. Very detailed physical field measurements were required to obtain the model parameters. The method can only be applied to fields that have a similar soil roughness and the same row directions. The average backscatter is measured by selecting areas on the images ranging anywhere from 14 to 1000 pixels per field, depending on the size of the field; however, the amount of variance in either the ground truth or the pixel measurements was not given. Many other specific limitations were described.

Table 7.3. Analysis of Study to Determine Crop and Soil Conditions.

From "Segmentation and Classification of Vegetated Areas Using Polarimetric SAR Image Data," by Y. Dong, A. K. Milne, and B. C. Forster, 2001. [30]

Background: An image of a natural, vegetated area is classified by area statistics using polarimetric AIRSAR data.

Measurand: Expressed as a scalar texture measure, as a function of area differences in first order first degree statistics, first order second degree statistics, and texture.

Model: The model groups spatial variations across the image (pixel area based).

Data: AIRSAR P-, L-, and C-band polarimetric data of two naturally vegetated sites.

Ground Truth: None.

Results: 10 landscape classes on one scene and 12 on the other.

Limitations: The size of the datasets was unusually small, only 512 × 512 pixels. This may be because the model did not perform well on other areas of the scene, which were therefore excluded from the study, or perhaps the algorithm was so computationally intensive that a larger image would have taken too long to process. The classes were largely distinguishable visually, according to image texture. Class groupings appear to have been chosen so that the classes align with the visual and statistical separability of the features.

Table 7.4. Analysis of Study to Classify Vegetated Areas.

7.2 Applying the Methodology to the Determination of Uncertainty in Classification Models and Algorithms

It would be interesting and instructive to conduct a classification study designed to elicit the ways in which a model or algorithm fails: that is, to determine the limits of its ability. Such information could be helpful in moving the technology toward a production environment. It would help in understanding the true uncertainties involved, not just the much smaller amount of error observed under tightly controlled conditions. Further, according to the scientific method, we should not seek confirmation that a particular method works well (because anyone can do that at least once!), but rather determine where, how, and why the model or algorithm fails.

However, the normal process in applying classification models and algorithms is that many different values for the given parameters are attempted, and only the "best" results, those that apparently minimize the amount of error, are reported. Thus, the methodology cannot be directly applied to the studies, since only a single optimized value (i.e., the final result) is typically reported for each parameter (if it is reported at all), and uncertainties of the parameter values are not provided. Let us then construct a way in which the methodology can be applied, by following the principles laid out earlier on repeatability and reproducibility. The question can be posed in the following way: how much uncertainty is introduced into the classification process by the particular model or algorithm used? This question is important for two reasons: (1) every model or algorithm simplifies a more complex reality, so by its very nature there is a degree of uncertainty associated with the simplification; and (2) we are not determining the true measurand, only estimating its probable value; therefore, repetitive measurements are necessary to determine the measure of uncertainty involved in the operation of the model or algorithm. We will set aside the first reason, since it is a complex topic in its own right, and focus on the second, to which the methodology can be applied with some adjustments.

Following the methodology, a hypothetical example is constructed for applying the concepts of repeatability and reproducibility to the classification model or algorithm, shown in Table 7.5. (Comparison with ground truth will be examined in the next chapter and will therefore not be considered here.) The emphasis is on determining the uncertainty of each parameter, according to the range of parameter values found to satisfy the repeatability and reproducibility criteria. It should be noted that in order to make equivalent comparisons, the number and types of classes used must be consistent throughout the testing process for each dataset.

If it can be shown that the range of uncertainties for repeatability and reproducibility is too great, then the algorithm should be recognized as having poor distinguishability under the given conditions. If several different algorithms are tested for repeatability and reproducibility on the same dataset and all do poorly, it may indicate that the SAR/InSAR data itself has too much ambiguity for those classes and therefore cannot be relied on for similar cases (or it may imply that the particular frequency, polarization, algorithm, etc. tested do not suit the classes, but another combination may).
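As a small illustration of step 1.b of the procedure in Table 7.5, the sketch below takes hypothetical optimized parameter values from five independent repetitions of the same classification and reports each parameter's mean and standard uncertainty; all of the numbers are invented.

    import numpy as np

    # Hypothetical optimized parameter values from five independent repetitions
    trials = {
        "threshold": [0.42, 0.45, 0.40, 0.47, 0.44],
        "window":    [5.0, 7.0, 5.0, 9.0, 7.0],
    }

    for name, values in trials.items():
        v = np.asarray(values)
        mean = v.mean()
        u = v.std(ddof=1)   # standard uncertainty from the spread of the repeats
        print(f"{name}: mean = {mean:.3f}, u = {u:.3f}")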

7.3 Applying the Methodology to Determine Uncertainty in a Sample Case

Here we apply the methodology to the study summarized in Table 7.3 to discriminate crop and soil conditions, based on the water-cloud model. Table 7.6 shows the coverage of the methodology as it relates to this example.

APPLYING THE METHODOLOGY TO ESTABLISH REPEATABILITY AND REPRODUCIBILITY
For classification models and algorithms

1. Perform repeatability tests on the data being classified:

a. Select a dataset and have the same person apply the same model or algorithm to the same data several times, using the same equipment, the same sampling method (when applicable), and the same set of feature classes. Each time, the person should perform the end-to-end classification process while optimizing the parameters, treating the process as an independent experiment (i.e., the parameters from the prior run should not be reused, but instead redetermined). The same sampling and processing methods and classification criteria should be applied in each case, without reference to the earlier classifications (which could bias the results).

b. When the repeated tests on a given dataset are completed, determine the mean and the uncertainty for each parameter from the range of values used in the repeated tests. (It also may offer insight to compare the final classification results of the repeated tests using a truth table or some other means.) The analyst should attempt to explain any significant, inconsistent results.

c. Use the values of the mean and uncertainty for each parameter in the partial derivatives of the mathematical model; then determine the combined uncertainty.

d. Repeat steps a, b, and c for several different scenes that represent a variety of terrains, ecosystems, seasons, and other factors of importance. It is anticipated that results will highlight faulty or ambiguous methods and assumptions.

2. Perform reproducibility tests on the data being classified:

Repeat steps a through d above, except that the duplicate testing is now performed by different people, at different locations, using different equipment and different sampling processes. The dataset, the model or algorithm, and the set of feature classes remain the same for each dataset tested.

3. Reporting

If the results are reasonably consistent, it may be possible to use these as 'calibrations' or benchmarks to represent typical uncertainties for the model or algorithm on somewhat different scenes, using the same type of data from the same sensor, angle of incidence, etc. For example, a statement such as the following might be entered on production maps that use the same model or algorithm: "Based on a series of benchmark tests, the ambiguities in the classification model (or algorithm) that created this type of map have been generally shown to comply with a standard combined uncertainty of 5% in repeatability and 10% in reproducibility." As in standard industrial practice, a follow-up program of random, continued sampling of production maps is recommended to maintain quality control.

Table 7.5. An example methodology for establishing repeatability and reproducibility in model and algorithm classification.

Chapter 7: Overview of Uncertainty Analysis

                          Step 1.    Step 2.  Step 3.  Step 4.   Step 5.  Step 6.  Step 7.
                          Define     Math.    Inputs   Find      Find     Report   Repeatab.
                          measurand  model    xi       μ(xi)'s   μc       results  & Reprod.
Classification models and algorithms: see the example of Section 7.3.

• Quantity given   C Quantity partly given   O Numerical value only

Table 7.6. Overview of Uncertainty Analysis in Classification Models and Algorithms. The standard methodology can be followed for any model or algorithm represented by a mathematical function. The degree of completion is indicated by the symbols in the chart. Columns 6 and 7 would be completed in a testing environment.

The water-cloud model gives the backscattered power for the whole canopy, σ⁰, as the sum of the vegetation contribution and the underlying soil contribution, as attenuated by the vegetation layer above it. The full model is given by:

σ⁰ = A V cos θ {1 − exp(−2BV/cos θ)} + (C + D h_v)

where V is a canopy descriptor, h_v is the volumetric soil moisture, coefficients C and D are determined from a linear regression of σ⁰ with h_v (note: the expression C + D h_v is given in decibels, while the other quantities are regular unitless numbers), and A and B are determined by fixing D and minimizing the sum of squares of the differences between the modeled and measured σ⁰. From this method, the study obtained the following parameters for corn crops at C-band, for an incidence angle of θ = 23° at VV polarization: A = 0; B = 0.09; C = −8.5; and D = 27.8; h_v was measured as 0.1, and V was found to be 1.5 ± 0.5. [29] The measurement uncertainty was provided only for V. Our task is then to determine the uncertainties for all of the parameters, both under the repeatability condition and under the reproducibility condition, using the method described in Table 7.5. Once the uncertainties are determined, they can be used in the following equations of the partial derivatives and the steps that follow, according to the methodology:

μ_σ⁰/A = V cos θ (1 − exp(−2BV/cos θ)) · μ_A = 1.38 · (1 − 0.746) · μ_A = 0.351 · μ_A

μ_σ⁰/B = 2AV² exp(−2BV/cos θ) · μ_B = 0

μ_σ⁰/C = 1 · μ_C

μ_σ⁰/D = h_v · μ_D = 0.1 · μ_D

μ_σ⁰/V = [A cos θ (1 − exp(−2BV/cos θ)) + 2ABV exp(−2BV/cos θ)] · μ_V = 0

μ_σ⁰/θ = {−AV sin θ (1 − exp(−2BV/cos θ)) + AV cos θ · exp(−2BV/cos θ) · (2BV sin θ/cos² θ)} · μ_θ = 0

μ_σ⁰/h_v = D · μ_h_v, with D = 27.8 dB ≈ 600

The combined uncertainties can then be determined in the same manner as in previous chapters. (The above equations show that the sensitivity to h_v, which equals the parameter D, dominates the model by a wide margin.)
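The arithmetic above is compact enough to script. The sketch below evaluates the water-cloud sensitivity coefficients with the study's parameter values and combines them with parameter uncertainties; apart from μ_V = 0.5 from the study, the uncertainty values are hypothetical placeholders.

    import math

    # Parameters for corn at C-band, VV, theta = 23 deg (from the study)
    A, B, C, D = 0.0, 0.09, -8.5, 600.0   # D = 27.8 dB ~ 600 (linear)
    V, h_v = 1.5, 0.1
    theta = math.radians(23.0)

    e = math.exp(-2 * B * V / math.cos(theta))   # attenuation term, ~0.746

    # Sensitivity coefficients (partial derivatives of the model)
    c = {
        "A":   V * math.cos(theta) * (1 - e),                      # ~0.351
        "B":   2 * A * V**2 * e,                                   # 0 (A = 0)
        "C":   1.0,
        "D":   h_v,                                                # 0.1
        "V":   A * math.cos(theta) * (1 - e) + 2 * A * B * V * e,  # 0 (A = 0)
        "h_v": D,                                                  # 600: dominant term
    }

    # Assumed standard uncertainties (u_V from the study; the rest hypothetical)
    u = {"A": 0.05, "B": 0.01, "C": 0.5, "D": 30.0, "V": 0.5, "h_v": 0.02}

    u_sigma0 = math.sqrt(sum((c[k] * u[k])**2 for k in c))
    print(f"combined uncertainty in sigma0: {u_sigma0:.2f}")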

Thus, the key to determining the uncertainty of the output of classification models and algorithms is to determine the uncertainty in the respective parameters (X1, X2, ..., Xn) that are input to the model. A simple way to view this is that the uncertainty of a parameter represents the ambiguity in measuring or determining that parameter, both by the same person under the same conditions (repeatability) and by different people under somewhat different conditions (reproducibility).

7.4 References

[1] Ito, Yosuke and Omatu, Sigeru, "Land Cover Mapping Method for Polarimetric SAR Data," SPIE Proceedings 3070, 1997, pp. 388-397.

[2] Liu, Guoqing; Huang, Shunji; et al., "Bayesian Classification of Multilook Polarimetric SAR Images with Speckle Model," SPIE Proceedings 3070, 1997, pp. 398-405.

[3] Foody, Giles M.; McCulloch, Mary B.; and Yates, William B., "Classification of Remotely Sensed Data by an Artificial Neural Network: Issues Related to Training Data Characteristics," Photogrammetric Engineering and Remote Sensing, Vol. 61, No. 4, April 1995, pp. 391-401.

[4] Coltelli, Mauro; et al., "SIR-C/X-SAR multifrequency multipass interferometry: a new tool for geological interpretation," Journal of Geophysical Research, Vol. 101, Issue B10, 10/25/96, pp. 23127-48.

[5] Lim, H.H.; Swartz, A.A.; Yueh, H.A.; Kong, J.A.; and Shin, R.T., "Classification of Earth Terrain using Polarimetric Synthetic Aperture Radar Images," Journal of Geophysical Research, Vol. 94, No. B6, June 10, 1989, pp. 7049-7057.

[6] Filho, Otto; Treitz, Paul; et al., "Texture Processing of Synthetic Aperture Radar using Second-Order Spatial Statistics," Computers and GeoSciences, Vol. 22, No. 1, 1996, pp. 27-34.

[7] Tzeng, Y.C. and Chen, K.S., "A Fuzzy Neural Network to SAR Image Classification," IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998.

[8] Moran, M. Susan; Vidal, Alain; Troufleau, Denis; Inoue, Yoshio; and Mitchell, Thomas A., "Ku- and C-Band SAR for Discriminating Agricultural Crop and Soil Conditions," IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998, pp. 265-272.

[9] Irving, William W., and Novak, Leslie M. “A Multiresolution Approach to Discrimination in SAR Imagery,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 33, No. 4, October 1997, pp. 1157-1168.

[10] Solberg, A.H.S., and Jain, A.K., "Texture Fusion and Feature Selection Applied to SAR Imagery," IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, March 1997, pp. 475-479.

[11] Xia, Zong-Guo and Henderson, Floyd M., "Understanding the Relationships between Radar Response Patterns and the Bio- and Geophysical Parameters of Urban Areas," IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 93-101.

[12] Henderson, Floyd, "SAR Applications in Human Settlement Detection, Population Estimation and Urban Land Use Pattern Analysis: A Status Report," IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 79-85.

[13] Solberg, Anne H., Taxt, Torfinn, and Jain, Anil, "A Markov Random Field Model for Classification of Multisource Satellite Imagery," IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 1, January 1996, pp. 100-113.

[14] Chen, K. S., Huang, W. P., Tsay, D. H., and Amar, F., "Classification of Multifrequency Polarimetric SAR Imagery using a Dynamic Learning Neural Network," IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 3, May 1996, pp. 814-820.

[15] Hara, Y., Atkins, R. G., Yueh, S. H., Shin, R. T., and Kong, J. A., "Application of Neural Networks to Radar Image Classification," IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 1, January 1994, pp. 100-109.

[16] Wong, Yiu-fai and Posner, Edward C., "A New Clustering Algorithm Applicable to Multispectral and Polarimetric SAR Images," IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 3, May 1993, pp. 634-644.

[17] Ceccarelli, Michele, and Petrosino, Alfredo, “Multi-feature adaptive classifiers for SAR image segmentation,” Neurocomputing 14 (1997) 345-363.

[18] Smits, Paul C., and Dellepiane, Silvana G. “Synthetic Aperture Radar Image Segmentation by a Detail Preserving Markov Random Field Approach,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 4, July 1997, pp. 844-857.

[19] Rignot, Eric, and Chellappa, Rama. “Segmentation of Polarimetric Synthetic Aperture Radar Data,” IEEE Transactions on Image Processing, Vol. 1, No. 3, July 1992, pp. 281-300.

[20] Rignot, E., and Chellappa, R. “Segmentation of synthethic-aperture-radar complex data,” Journal Optical Society of America A, Volume 8, No. 9, September 1991, pp. 1499-1509.

[21] C.V. Stewart, B. Moghaddam, K. J. Hintz, and L. M. Novak, “Fractional Brownian Motion Models for Synthetic Aperture Radar Imagery Scene Segmentation,” Proceedings of the IEEE, Vol. 81, No. 10, October 1993, pp. 1511-1522.

[22] Wegmuller, Urs, and Wemer, Charles. “Retrieval of Vegetation Parameters with SAR Interferometry,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 18-24.

[23] Wegmuller, Urs, and Wemer, Charles L. “SAR Interferometric Signatures of Forest,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 33, No. 5, September 1995, pp. 1153-1161.

[24] Raghu, P. P. and Yegnanarayana, B. “Multispectral Image Classification using Gabor Filters and Stochastic Relaxation Neural Network,” Neural Networks, Vol. 10, No. 3, 1997, pp. 361-472.

[25] C. L. Williams, K. McDonald, E. Rignot, L A. Viereck, J.B. Way, and R. Zimmermann, “Monitoring, classification, and characterization of interior Alaska forests using AIRSAR and ERS-1 SAR,” Polar Record 31(177), 1994, pp. 227-234.

[26] Freeman, A., and Durden, S.L. “A three-component scattering model for polarimetric SAR data,” a Jet Propulsion Laboratory Research Report sent electronically by Anthony Freeman on 2/9/99, 30 pages, (date unknown)

206 [27] Freeman, A-, Chapman, B., and Alves, M. MAPVEG Software User's Guide, JPL Document D-11254, Jet Propulsion Laboratory, October 1993.

[28] Gamba, Paolo, and Houshmand, Bigan, “Digital Surface Models and Building Extraction: A Comparison of IFSAR and UDAR Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 4, July 2000, pp. 1959-1968.

[29] Moran, M.S., Vidal, A., Troufleau, D., Inoue, Y., and Mitchell, T., “Ku- and C- Band SAR for Discriminating Agricultural Crop and Soil Conditions,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998, pp. 265-272.

[30] Dong, Y., Milne, A.K., Forster, B.C., “Segmentation and Classification of Vegetated Areas Using Polarimetric SAR Image Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 2, February 2001, pp. 321-329.

CHAPTER 8

UNCERTAINTY IN MAP OUTPUT REQUIREMENTS

Chapter 7 discussed the process of classification; in this chapter the output of

classification is addressed. In principle, the literature should offer meaningful

demonstrations of how well a particular sensor (such as SAR/InSAR) and algorithm can

be used to distinguish certain classes; what levels of ambiguity can be expected under

what conditions; and what are the limitations (i.e., boundary conditions).

However, there is often a mismatch between the classes set by operational

requirements (specifications) of an organization and the typical choice of classes of

convenience and distinguishability used by a researcher. If classes in a SAR/InSAR

dataset, such as trees and shrubs, cannot be easily distinguished from each other either

visually or statistically, a researcher can simply designate the class “trees and shrubs”. In

addition, a researcher has the luxury to adjust parameters extensively on a particular

scene to achieve a minimum-error result, while production mapping needs to move quickly and efficiently through the necessary steps with a minimum of delay. In these ways a researcher may claim that his or her method is better than another because it achieved “95% accuracy” while a competitor's method attained only “90% accuracy.”

No matter that the classes were different, the scenes were probably different, and any number of other variables came into play. Thus, the level of accuracy attained by a researcher may differ greatly from that attainable in production, where class descriptions are determined a priori.

The end result of a classification of SAR/InSAR data is usually a thematic map.

This map is often created and manipulated in a geographic information system (GIS).

The term GIS is used here rather loosely to include systems that classify and/or analyze remotely sensed data. In order to understand the full range of uncertainties involved in map production, we now turn our attention to the final stage: the examination of map output requirements, according to two major areas. These areas are not exhaustive, but rather are intended to illustrate the need for a range of broader standards in map output requirements, particularly as it pertains to feature classification.

This chapter builds on the findings of an earlier project on Large-Scale Feature

Classification conducted under the auspices of the Ohio Geographically-Referenced

Information Program and sponsored primarily by the Federal Geographic Data

Committee (FGDC).¹ The project set out to match feature descriptions of one GIS database to those in another according to a hierarchical framework. It was presumed that local organizations that map at larger scales have more detailed sets of feature descriptors than federal agencies that map at smaller scales. If larger-scale data could be aggregated to clusters of more general data, it might be possible to use existing local data to assemble a national spatial database according to the outline of the National Spatial Data

Infrastructure (discussed later in the chapter). However, that presumption proved to be invalid as the project unfolded owing to the widely differing ways that (even the same) features were classified and aggregated in every class for every dataset. In some areas the

federal datasets were far more detailed than the local ones, and classes overlapped in

irregular ways. Even when the feature name was the same (as in “roads”), the

descriptions differed on what comprised the feature (see Table 8.1). The project pointed

to the clear need for feature classification standards for similar communities of users.

While the prior chapter dealt more with the mechanics of the act of classification,

in this chapter output requirements are examined in terms of the uncertainty involved in

classifying data according to a standard set of feature classes.

A general, brief discussion on GIS databases as they pertain to feature

classification will be offered. The current state of feature classification standards will be

discussed. A method will be offered for determining the associated uncertainties. The

discussion in this chapter will be more qualitative than quantitative, because relatively

few quantitative measures exist in these areas.

8.1 The Evolution of Standards in GIS Databases

In this section the evolution of certain standards related to GIS will be discussed.

The discussion begins with general observations on new product development and the introduction of standards, followed by a discussion of GIS as an instance of a product and the recent emergence of some GIS standards. Standards play a key role in making a quality product that is reliable, safe, repeatable and reproducible to within known tolerances, and that is compatible with other, similar products and infrastructure.

When a new product is in its early, formative stages, a researcher, scientist, or engineer is typically free to develop a concept or carry out a study with relatively few constraints. Usually, the ability to meet functional requirements is the main goal in early development. However, once the basic idea is fully developed, a range of existing

standards must be followed (or new ones developed), to: be in compliance with

government regulations; be compatible with existing products that are in widespread use;

and manufacture a product within quality control standards so that the final product is consistent and reliable (as well as repeatable and reproducible to within certain tolerances, per earlier discussions).

For example, suppose that a company is designing a toaster for household use.

The product must meet certain functional requirements, such as: the ability to hold two slices of varying thickness; adjustable timing; automatic pop-up, etc. With this in mind, concepts are generated on what the toaster will look like and its internal mechanisms are designed. Prototypes are manufactured and tested; and finally, the toaster is mass- produced and distributed for sale. Since a toaster is a mature product (i.e., it has been manufactured for many decades), then throughout this process, design and manufacturing standards are applied. The toaster needs to be safe to operate so a user doesn’t get burned or shocked; in the U.S. it needs to have electrical compatibility with 110-120V a.c. power so that it can draw on existing electric current, and so on.

There is a balance between standards and innovation. On the one hand, too many standards (especially when imposed very early on a rapidly-evolving product) make it difficult to generate new products because the parameters are tightly controlled, thus severely restricting innovation. But on the other hand, too few standards can result in a product that is of irregular quality and reliability, perhaps dangerous to the consumer, and incompatible with existing infrastructure (such as the electrical supply). The adoption of

standards signals a certain level of product maturity by recognizing the need for

consistency in the results, and perhaps improved inter-compatibility among similar

products.

As seen in the standards review in Chapter 3, the era of widespread GIS

standardization has begun. However, the relative lateness of their arrival means that

many thousands of independent GIS databases have been established, at great cost.

These databases have been designed and populated according to unique perspectives.

The development of these individualized databases means that we now have a national

(and international) situation whereby datasets are largely incompatible with each other,

both within the same software environment and between different environments. Even

within the same software environment (such as ESRI), databases are typically unable to

share similar information because different database organization and class designations

for each make it difficult to combine and transparently share information with others. In

essence, in the current GIS environment we now have many thousands of different

“toaster” designs, few that work well with each other and/or have the ability to share or

combine data from different sources easily (unless a translator is developed, such as for

the U.S. Census Bureau’s TIGER files).

The concept behind the problem as it relates to how features are classified in GIS databases is depicted in Table 8.1. The table presents an abbreviated set of class descriptors, as might be recorded by three different agencies in the same county. In this case, it is possible that different databases are needed to some extent, because the purposes for which the data are to be used are to some degree mutually exclusive, as well as carrying different requirements for positional accuracies. In such cases separate

Database from Agency 1:
ROADS = primary & secondary roads only, as centerlines, as digitized from 1:24K paper quadrangle maps. Pos. acc. = 60'
WATER = major rivers (over 50' across) and standing water (over 200 acres in size). Pos. acc. = 60'
PURPOSE = statewide resource management

Database from Agency 2:
ROADS = secondary & tertiary roads only, as depicted by road edges, alleys excluded, from aerial photos. Pos. acc. = 2'
WATER = major and secondary rivers (over 20' across) and year-round standing water over 20 acres in size. Pos. acc. = 2'
PURPOSE = transportation planning and infrastructure management

Database from Agency 3:
ROADS = all paved surfaces, including roads, alleys, driveways, and parking lots, defined by edge of pavement. Pos. acc. = 2'
WATER = max. extent of flowing or standing water at any time of the year with total surface area over 1 acre. Pos. acc. = 2'
PURPOSE = input for surface runoff models for storm water management planning

Table 8.1. Feature class descriptors for three hypothetical databases that serve different purposes. However, in the absence of classification standards, even for similar organizations, the same feature classes tend to be grouped differently, and cannot therefore be readily mapped to each other. (See Appendix A for further discussion.)

database designs may be warranted, so long as the users are only interested in that single

county and do not need to overlay the data, or do an analysis in a GIS beyond the

confines of the single dataset. However, geospatial data typically ends up being

combined with other data and used in applications well beyond its original, intended

purpose. Yet the one-of-a-kind, unique design parameters also have the effect of

substantially limiting the potential applications that may emerge.

Let us now envision a situation in which a range of standards have been applied to develop a GIS database. Accuracy, consistency, and other standards have been used to develop "sister" GIS databases from the beginning, which carry the same features at the same scale. Now suppose that a regional planner or researcher needs to combine (i.e., tile) data from two different counties for analysis and planning; or a statewide agency needs to combine local data for statewide resource management. Let us suppose that

everything in the database design is identical, except that each county has decided to

assign its features to somewhat different class groups than the other counties. Yet even

this “small” limitation prohibits the data from being readily combined! (See discussion

on classification in Appendix A.) This situation may compel the regional planner to re-create another entire dataset of the same features, simply due to the non-homogeneity of feature classifications across counties.

The data-sharing problem can also be described by the mathematical concept of mapping (or assigning) elements from one domain into another: i.e., when several objects are grouped into one (many-to-one), it is unlikely that the grouped objects will be able to be ungrouped later (one-to-many) without an a priori structure and design that allows it to be done. Such information is not typically provided in a GIS.
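A small sketch makes the point; the feature classes and groupings below are hypothetical.

```python
# Grouping detailed feature classes into broader classes is a many-to-one
# mapping; without a stored crosswalk, the grouping cannot be inverted.
detailed_to_group = {
    "interstate": "roads", "county road": "roads",
    "alley": "roads", "driveway": "roads",
    "river": "water", "pond": "water",
}

group = detailed_to_group["alley"]   # many-to-one: the original class is lost

# Attempting the inverse (one-to-many) yields only a set of candidates:
candidates = [k for k, v in detailed_to_group.items() if v == group]
print(candidates)   # ['interstate', 'county road', 'alley', 'driveway']

# Only an a priori design that records the detailed class alongside the
# group (i.e., storing both fields) makes the ungrouping recoverable.
```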

An exception for reasonable data sharing is when users are able to work with the database exactly as structured using an interface developed by the same database producer, such as internet access to a particular GIS dataset. The user will, however, be constrained by the limitations imposed by the specific interface in that case. When using an already-developed translator, almost certainly some of the functionality and data will be lost in the translation; alternatively, users with sufficient resources can make extensive modifications (and possibly re-enter the data) for their own purposes.

It is interesting to note that the major emphasis in the development of widespread

GIS standards (until quite recently) has not been for similar types of users to produce coordinated datasets that use the same feature descriptions (among other factors of coordination). We are only beginning to see the development of sets of common standards for feature classification and database structure by groups with common

interests, agreeing on how to classify various features. The lack of standard feature class descriptions alone is a major stumbling block for database integration, no matter how well other levels of GIS standards are addressed.

There are many reasons why more standards have not been developed, not the least of which is the relative newness of the technology, organizational independence, consulting firms wanting to maintain profitable one-of-a-kind designs, etc. But whatever the reason, from where we stand now it is apparent that these thousands of different databases, comprising investments of perhaps hundreds of millions of dollars in the U.S. alone, cannot be readily abandoned because new standards have been released. And from the findings of the Large Scale Feature Classification Project, trying to shoehorn existing, aggregated classes of data into new “standard” classes is a losing proposition at best. It can be conjectured that if the new and upcoming standards are widely embraced, many datasets will simply need to be re-created entirely from scratch.

All of the foregoing discussion is to explain why feature classification standards arrived at before datasets are created are desirable, to promote easy data exchange and combination, and flexibility for a wide variety of uses. The existence of so many established databases with different feature groupings helps to explain why the concepts of the NSDI, the SDTS standard, and metadata standards (while useful efforts in and of themselves) do not go nearly far enough in addressing the need for common feature descriptors for different purposes, or different accuracies/scales of magnitude. These standards and concepts promote and emphasize the description and sharing of data as it exists rather than promoting coordinated standards efforts for true compatibility. Even the best, “ultimate” metadata or spatial data transfer method can only describe the

characteristics of a [probably unique] dataset as it now exists and the mechanics of data transfer, but these methods do not address the more fundamental issue of how to

generate quality data according to a definitive set of standards.

Given the continuing, slow emergence of specific GIS classification standards for similar groups of users (e.g., county auditors), data-sharing problems can only continue to multiply as the data itself multiplies. In fact, the limitations due to classification alone

severely constrain what can be done with the data in its current form. The SDTS and

some metadata standards were discussed earlier in Chapter 3; in the following section the

National Spatial Data Infrastructure will be examined in terms of the uncertainties in

delineating feature types. After that a recently-announced vegetation classification

standard will be discussed as an example of emerging broader-based standards that serve

multiple purposes and organizations. The uncertainties involved in these classification

standards will be discussed in a general way, in order to underscore the larger problem of

uncertainty in map output specifications. Fundamental to performing an accurate and

repeatable classification is knowing, with a minimum of ambiguity, which features are to

be assigned to a given class.

8.2 The National Spatial Data Infrastructure

In 1995 the Mapping Science Committee of the National Research Council recommended the establishment of a National Spatial Data Infrastructure (NSDI). The

NSDI is intended as a collaborative effort to create a widely available source of basic geographic data, emphasizing the most commonly-used data themes. The Committee’s report lays out a general plan for the NSDI, by recommending two types of data: foundation data and framework data. Foundation data are defined as “the minimal

directly observable or recordable data from which other spatial data are referenced and compiled,” [1] and include geodetic control, orthorectified imagery, and terrain

(elevation) data. The framework data originally comprised the three data themes of transportation, hydrology, and boundary elements; later the cadastral theme was added.

Eventually, all seven types of data came to be jointly known as Framework Data. The report recognizes that multiple frameworks at multiple scales may be necessary. The task of coordinating the NSDI fell on a newly-established office, the Federal Geographic Data

Committee under the U.S. Department of the Interior.

It is interesting to note a strong emphasis on data integration in the opening statements of the report:

“Spatial data are expensive to generate, maintain, and integrate with other data. No single federal, state, or local agency can effectively respond to all possible spatial data needs of their constituencies. Nor can a single level of accuracy, consistency, or currentness be reasonably applied to all data products or applications. With a common locational registry for spatial data of all kinds, data produced for one application can be integrated more readily with other data. Without this, data sharing and exchange are impeded. Data sharing can minimize duplication, reduce long-term costs, and streamline analysis and decision making. Mechanisms to integrate and exchange digital spatial data are a fundamental component of the national spatial data infrastructure (NSDI).” [1, italics added]

The report further emphasized the importance of data quality and its role in data integration:

“Data quality is an important component of data sharing because people need to know the reliability of interpretations and decisions based on the data generated by one agency and used in an application by another organization. No single federal, state, or local agency can effectively respond to all the possible spatial data needs of their constituencies. Nor can a single level of accuracy, consistency, or currentness be easily applied to all data products or applications...it is true, however, that one data-producing agency must often depend on another data producer for source data, which points once again to the importance of sharing data and of integrating data from multiple sources.” [2, italics added]

Recent statements from the FGDC promote framework data as “data you can

trust—the best available data for an area, certified, standardized, and described according

to a common standard;” this data is intended to be used by many organizations for

“attaching their own geographic data.” [4] To date, 16 FGDC-endorsed standards

documents are at the final stage; 7 have completed public review; and another 10 are in

the draft or proposal stage. [5] In Section 8.3 one of the standards at the final stage will

be briefly evaluated: the Vegetation Classification Standard. This standard was chosen because it pertains most closely to the classification and assessment of SAR/InSAR thematic maps, and aligns with the dominance of vegetation classifications among SAR/InSAR classification studies.

8.3 Evaluation of Uncertainty in the Vegetation Classification Standard

In June 1997 the FGDC formally endorsed the Vegetation Classification Standard as presented in document FGDC-STD-005, submitted by the Vegetation Subcommittee.

It is intended to address the need for a federal standard for vegetation classification and reporting of vegetation statistics. The document “proposes a standard for terminology, core (or minimum) data and vegetation classification.” [6] The purpose of the national standard is to require that all federal classification efforts have core components that are uniform across all federal agencies, to enable data sharing and aggregation for all federal agencies. The classifications are to be used by federal agencies to inventory, map, and report on the United States’ vegetation resources.

In the document, vegetation is described as a collection of plants or plant communities. The classification emphasizes vegetative and floristic (flowers or flora)

Vegetation Standard for Class I, Subclass A, Group 1, Subgroup N

Class: Closed tree canopy (trees with overlapping crowns, forming 60-80% cover)

Subclass: Evergreen forest (generally >75% of total tree cover)

Group: Tropical or subtropical broad-leaved evergreen rainforest. (broad-leaved evergreen trees, neither cold- nor drought-resistant)

Subgroup: Natural/Semi-natural

Formation: a. Lowland tropical or subtropical rainforest b. Submontane tropical or subtropical rainforest c. Montane tropical or subtropical rainforest d. Montane tropical or subtropical cloud forest e. Subalpine tropical or subtropical rainforest f. Temporarily flooded tropical or subtropical rainforest g. Semipermanently flooded tropical or subtropical rainforest h. Saturated tropical or subtropical evergreen rainforest i. Tidal tropical or subtropical rainforest j. Seasonally flooded tropical or subtropical rainforest

Table 8.2. Example of one class, subclass, group, subgroup, and formation of the National Vegetation Classification Standard. [6]

characteristics. The classification includes 7 classes; 21 subclasses; 77 groups; 2

subgroups; and over 300 formation descriptions [6]. An example of each of these

categories is shown in Table 8.2.
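To suggest how such a hierarchy might be represented so that class assignments remain explicit at every level, the following sketch encodes the Table 8.2 entry as a nested structure; the representation is illustrative, not part of the standard.

```python
# The Table 8.2 entry encoded as class -> subclass -> group -> subgroup ->
# formations. Only two of the ten formations are listed here for brevity.
nvcs = {
    "I. Closed tree canopy": {
        "A. Evergreen forest": {
            "1. Tropical or subtropical broad-leaved evergreen rainforest": {
                "N. Natural/Semi-natural": [
                    "a. Lowland tropical or subtropical rainforest",
                    "b. Submontane tropical or subtropical rainforest",
                ],
            },
        },
    },
}

def valid_path(tree, path):
    """Check that an assignment follows the hierarchy level by level."""
    node = tree
    for level in path[:-1]:
        if level not in node:
            return False
        node = node[level]
    return path[-1] in node   # the formation must appear in the leaf list

path = ("I. Closed tree canopy", "A. Evergreen forest",
        "1. Tropical or subtropical broad-leaved evergreen rainforest",
        "N. Natural/Semi-natural",
        "a. Lowland tropical or subtropical rainforest")
print(valid_path(nvcs, path))   # True
```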

A guiding principle for applying the vegetation classification is that (according to

the standard) it must be repeatable, reproducible and consistent. The classification

methods are intended to be clear, precise, quantitative where possible, and based upon

objective criteria, so that the outcome is the same no matter who performs the classification.

It can be noted that the standard does not prescribe any given positional or class

accuracy, or any given testing method. Rather, it stipulates only that the procedure used

to ascertain position and class accuracy should be described, along with the results.

Although repeatability and consistency are among the stated goals, no definition of their

meaning is offered, or guidance on how these qualities may be attained or verified.

Although it is clear that some important and significant progress is being made in

the area of standards, there is still a long way to go. In the vegetation standard we see yet

another GIS product being developed, which is not guided by a full range of simple

standards that govern millions of other (non-GIS) products that are produced every day.

How does our methodology apply for determining the uncertainty of

classifications? To begin with (step one), the measurand is to be described as fully as

possible. In this case our measurand corresponds to a particular feature class, as defined

by a set of class descriptors according to Y = f(X_i); each part of the class description is given by X_i. The descriptions should be defined as thoroughly and completely as possible

and ambiguities resolved. Ambiguous descriptions should be minimized, since they yield

erratic results.

Given the current state of knowledge, the only definitive statement that can be offered on the uncertainty of current classification practice is that the degree of ambiguity introduced in the classification process is largely unknown, because repeatability based on class definitions and other factors has not been tested or established. Perhaps a way could be devised to measure the level of ambiguity (which also happens to be the measure of uncertainty) in the classification process. A testing procedure might be formulated according to the methodology, not only for establishing repeatability but also reproducibility. Although we cannot, given the current classification process, assign a mathematical model to it or strictly carry out the other steps of the methodology, the concepts and methods for determining repeatability and reproducibility can be applied by constructing a series of repetitive measurements. These measurements will then allow us to complete the steps of the methodology. An example of how such a test might be conducted is shown in Table 8.3.

The methodology for assessing uncertainty in map output requirements is summarized in

Table 8.4 according to the measurement uncertainties.

According to the reproducibility criteria, when the same dataset and classification standard are used, but different software, sampling methods, data processing procedures, and other variations occur, the classification results should be equivalent to within a given tolerance, when the testing methods (i.e., the sampling methods, data processing, etc.) are valid. It would be very interesting to test the level of uncertainty in vegetation classification following this scheme. Is the uncertainty 50% under certain conditions and for certain classes, but only 5% under others? How can these conditions be quantified?

When classification results are demonstrated not to be repeatable or reproducible within acceptable tolerances, then there are major flaws, and the sampling, processing, or classification scheme needs to be substantially changed.
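The comparisons called for by such testing reduce to simple agreement computations over nominal labels. The sketch below, using hypothetical per-pixel class labels, shows how run-to-run agreement and per-class consistency might be tabulated.

```python
# Hypothetical per-pixel labels from two repeated classifications of the
# same scene, plus a truth dataset.
run_1 = ["forest", "forest", "water", "crop", "crop", "urban"]
run_2 = ["forest", "crop",   "water", "crop", "urban", "urban"]
truth = ["forest", "forest", "water", "crop", "crop", "urban"]

def agreement(a, b):
    """Fraction of pixels assigned the same class in both labelings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

print(f"run-to-run agreement (repeatability): {agreement(run_1, run_2):.0%}")
print(f"run 1 vs. truth:                      {agreement(run_1, truth):.0%}")

# Per-class consistency indicates which class definitions are ambiguous.
for c in sorted(set(truth)):
    idx = [i for i, t in enumerate(truth) if t == c]
    hits = sum(run_1[i] == c and run_2[i] == c for i in idx)
    print(f"{c}: {hits}/{len(idx)} pixels consistently labeled")
```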

It should be mentioned that further efforts to standardize vegetation classification are in process by the Vegetation Classification Panel of the Ecological Society of

America (ESA). Their draft document underscores the importance of classification standards, for “without a set of nationwide standards, data from different sources cannot be integrated, compared or evaluated.” [7] The FGDC standard is critiqued as focusing

APPLYING THE METHODOLOGY TO ESTABLISH REPEATABILITY AND REPRODUCIBILITY
For feature classification guided by classification standards

1. First determine the repeatability and reproducibility of the ground truth classification, by observing the following steps.

2. Perform repeatability tests on the data being classified:

a. Select a set of data and have the same person classify the same data several times, performing the end-to-end process each time. The same sampling and processing methods and classification criteria should be applied in each case, without looking at the earlier classifications (which could bias the results).

b. When the repeated classifications are finished, compare them with each other and with the truth dataset, through the use of a truth table or some other adequate means. It may be helpful for the analyst to attempt to explain any significant, inconsistent results.

c. From the comparisons, determine the measurement uncertainty, including the uncertainty of the truth dataset. Per-class uncertainties as well as aggregate uncertainties should be indicated.

d. Repeat the whole process for several different scenes that represent a variety of terrains, ecosystems, seasons, and other factors of importance. It is anticipated that results will highlight faulty methods and assumptions, as well as unreasonable/unattainable class distinctions in the standard.

3. Perform reproducibility tests on the data being classified:

a. Select a set of data and have several different people in different labs each classify the same data using the same classification standard, but different sampling and processing methods.

b. Compare each of the final classifications with each other and with the truth dataset, through the use of a truth table or some other adequate means. It may be helpful for the analysts to attempt to explain any significant, inconsistent results.

c. Repeat c from part 2.

d. Repeat d from part 2.

4. Reporting

If the results are reasonably consistent, it may be possible to use these ‘calibrations’ or benchmarks to represent typical uncertainties in similar situations. For example, a statement such as the following might be entered on production maps that follow similar processes: “Based on a series of benchmark tests, the class ambiguities (error in class assignment) in this map have been generally shown to comply with a standard uncertainty of 5% in repeatability and 10% in reproducibility.”

Table 8.3. An example methodology for establishing repeatability and reproducibility for map classification standards.

OVERVIEW OF UNCERTAINTY ANALYSIS

Methodology steps (columns): Step 1. Define measurand; Step 2. Mathematical model; Step 3. Inputs; Step 4. Find means; Step 5. Find uncertainties; Step 6. Report results; Step 7. Repeatability & Reproducibility.

Map-output tasks (rows):
1. Establish unambiguous feature class descriptions.
2. Classify data and perform R&R testing.
3. From R&R data determine the amount of uncertainty in classification.

Legend (cell markers): quantity determined; quantity partly determined; numerical value only.

Table 8.4. Overview of uncertainty analysis in map classification output requirements. The methodology must be applied differently in this case because feature classes offer only nominal data. The above steps are: (1) define classes as unambiguously as possible, then (2) generate the classified scene using the methodology of Chapter 7, followed by repeatability and reproducibility (“R&R”) testing. The test results will generate numerical comparison data against a truth dataset, according to the number and type of feature pixels matched. The methodology can then be applied to the comparison data.

on the physiognomic levels of vegetation classification (related to the structure and life form of a plant community). This work extends the FGDC standard to the classification of floristically-defined vegetation types. Some of the categories in the FGDC standard

“are only introduced conceptually and offer no details of nomenclature or methods for delimiting or describing them” [7]. The ESA effort follows on extensive classification-definition work performed by The Nature Conservancy. While intended to comply with and extend the FGDC standard, one of the Panel's stated goals is to

“advance quality assurance of the data in the national vegetation classification,” and a guiding principle is that “methods developed and used to apply the classification must be repeatable and consistent” [7]. Further, the draft standard prescribes tagging all vegetation units with “confidence codes” that reflect the type of information and the level of analysis that was available for assigning a given class. The three available confidence codes are given in Table 8.5.

Classification confidence levels for associations

1. STRONG: Classification is based on quantitative analysis of verifiable field data from occurrences that can be relocated. A sufficient number of plots in vegetation distribution by both region and type is given.

2. MODERATE: Classification is based on quantitative analysis of a partial data set and/or a more qualitative assessment of sufficient quantity and quality, but may include limited samples or geographic range.

3. WEAK: Classification is based on anecdotal information or descriptions not accompanied by field plot data. Local experts have often identified these types, but it is unknown whether they meet national standards.

Table 8.5. Classification confidence levels for associations. These levels are assigned to each association based on the kind and amount of information that was used to classify it. [after 7]

However, the draft standard makes no mention of positional accuracies of the data, only classification confidence according to the three levels. But are these levels sufficient to describe the quality of the classification decisions? Perhaps, or perhaps not; further testing and verification is needed to see if this simple tri-level assessment is adequate. However, even this is a step beyond most prior efforts that only assigned a feature to one thematic class or another, with no clarification of the relative certainty.

Other statements in the draft standard underscore the importance of repeatability and reliability in classification, affirming earlier points:

“Vegetation classification attempts to identify discrete, repeatable classes of relatively homogeneous vegetation communities about which reliable statements can be made.” [8, italics added]

The quote implies that without repeatability and reliability, a classification effort is not truly successful. This is because a primary goal in vegetation classification (and other types of classifications) is to record the vegetation classes over time to determine how vegetation is changing: without repeatability, that goal cannot be reached.

8.4 Summary

It is clear that emerging classification standards are beginning to point the way towards unambiguous class definitions and improved assessment of the uncertainty of a

GIS database; however, the methodologies for assessment still tend to be overly general and more qualitative than quantitative. By contrast, our methodology prescribes how to determine repeatability and reproducibility by applying simple concepts followed in many other fields. Since classified (thematic) maps are defined as a product in their own

right, we assert that standards and testing methods that apply to many other products should be adapted and applied to mapping products that are generated in a production mode, including SAR/InSAR thematic maps.

8.5 Note and References

Note

¹ FGDC Cooperative Agreements Program (CAP): “Large Scale Feature Classification,” an OGRIP project involving team members from the Center for Mapping, the Ohio Department of Natural Resources, the Ohio Environmental Protection Agency, and others, ca. 1997.

References

[1] Mapping Science Committee, Board on Earth Sciences and Resources, Commission on Geosciences, Environment, and Resources, National Research Council, A Data Foundation for the National Spatial Data Infrastructure, Washington, D.C.: National Academy Press, 1995, p. 1.

[2] Ibid., p. 10.

[3] Sperry, Roger, and Donahue, Arnold, “Geographic Information for the 21st Century,” GEOWorld, October 1999, pp. 34-35.

[4] Federal Geographic Data Committee, “Overviews: What the Framework Approach Involves,” URL http://fgdc.er.usgs.gov/framework/overview.html, accessed 7/26/01.

[5] Federal Geographic Data Committee, “Status of FGDC Standards as of June 15, 2001,” URL http://www.fgdc.gov/standards/status/textstatus.html, accessed 7/31/01.

[6] Vegetation Subcommittee, Federal Geographic Data Committee, FGDC-STD-005: Vegetation Classification Standard, June 1997.

[7] Vegetation Classification Panel of the Ecological Society of America, Review Draft v. 6.0, July 2000, “An Initiative for a Standardized Classification of Vegetation in the United States,” URL http://esa.sdsc.edu/initv60.htm, accessed 7/30/01.

[8] Kimmins, J.P., Forest Ecology: A Foundation for Sustainable Management, Second Edition, Upper Saddle River, New Jersey: Prentice-Hall, 1997.

CHAPTER 9

SUMMARY AND CONCLUSION

This dissertation examined the suitability of SAR/InSAR data for thematic map

series in a production environment. In order to address this question, it became necessary

to investigate a broad range of issues, both to shed light on the current methodology and

to suggest potential avenues for improvement. It became clear early on that something

was amiss in the way in which SAR/InSAR classification was practiced. However,

establishing a firm footing on the “miry clay” for understanding and evaluating these

problems was a challenge. While remarkable strides have been made from an engineering

perspective, practitioners in remote sensing have seriously lagged behind sensor

engineers in error modeling and quality assurance.

9.1 General Summary

Chapter 1 introduced the basic topic and problem to be investigated, while

Chapters 2, 3, and 4 laid groundwork that allowed problems in the current paradigm to be

exposed, as well as to establish a new methodology based on sound scientific and

engineering principles. The new methodology was then applied to the four basic processes involved in SAR/InSAR thematic classification, as shown in Figure 9.1 (figure copied from Chapter 1).

[Figure 9.1 flow diagram: Terrain/landscape → (1) raw signal data → Antenna and Processor → (2) image data → Image Classification Models/Algorithms → (3) classified landscapes → output: digital/hardcopy maps (topography, landscape classes)]

Figure 9.1 Four basic processes in generating landscape maps from SAR.

In Process 1: Sensor and Scene Interactions (Chapter 5) the general measurement

standards were applied, using standard error propagation, to determine the largest

contributors to uncertainty in the measurements at this level. The results showed that

uncertainty in the internal sensor processes was well understood and well-modeled, and

that through careful antenna/sensor design most of the internal problems could be

minimized. However, the results also showed the lack of a comprehensive model to

represent the backscatter’s interaction with the terrain, in which speckle, surface

roughness, local incidence angle, and the dielectric constant introduce a very high

degree of uncertainty that dwarfs that of the antenna and sensor electronics. These combined factors introduce what appear as random variations in the signal return, for

which the backscatter value can vary by up to 10,000 times the signal strength from pixel to pixel (on a standard numerical scale).

The microwave signal is quite sensitive to variations in the terrain, such that small changes in the landscape at the wavelength scale can introduce major changes in the signal response. So, while on the one hand the relatively long wavelength of a microwave signal is able to easily penetrate atmospheric conditions, on the other hand, that ability comes with a price. In addition, the fact that the signal is coherent carries the benefit of being able to use the phase to infer elevations, but also means that constructive and destructive interference, as well as layover and shadowing due to Doppler and time dependencies, can make results difficult to interpret. After all of the factors in the first main process are considered, the highly variable terrain conditions remain as the greatest hindrance to reliable classification results in the production of thematic maps.

In Process 2: Antenna/Receiver and Processor Functions (Chapter 6) the uncertainties again are well-modeled and well understood from a systems engineering perspective. Many of the system factors can be corrected to a large degree. However, it was demonstrated that the main contributions to the uncertainty, as in Process 1, again depend mostly on local terrain conditions (and their influence on radiometric and geometric measurements, for example).

In Process 3: Image Classification Models and Algorithms (Chapter 7), many of the problems with the classification methodology as it is currently practiced were examined. The new methodology based on general measurement standards was adapted to strengthen this area, through the use of uncertainty modeling and repeatability and reproducibility testing. This type of testing is expected to attach a stronger measure of confidence to the application of a given model or algorithm, by demonstrating the effective range of normal variabilities (i.e., uncertainties).

Finally, in Process 4: Map Output Classification (Chapter 8), the new methodology uses repeatability and reproducibility criteria through repeated tests to determine the uncertainty that is associated with the class definitions and how they are applied.

In this effort, a way to quantify the uncertainties associated with each of the four processes involved in SAR/InSAR classification was identified. However, experimental testing is needed to confirm the extent to which the methodology can be applied in practice, not just in concept as was done here. Appendix B suggests potential avenues to explore which may yield more definitive results than have past efforts in feature classification.

9.2 Applying the New Methodology to Thematic Map Classification of SAR/InSAR Data

Putting the new methodology into practice can be summarized by three basic steps. For convenience the steps are grouped here as: (a) defining the intermediate and final measurands and the associated models for the particular purpose at hand; (b) determining the experimental mean and standard deviation for the model input parameters; and (c) determining the uncertainty and combined uncertainty of the measurands. In earlier chapters, we examined how to apply the new methodology to individual models that are part of the whole process. Here we examine how to combine the individual models into an overall total model. Each of the three steps is described below. The overall process is depicted in Figure 9.2.

In the first step (a), the intermediate (Y_i) and final (Y_TOT) measurands are defined, and the corresponding mathematical models are stated to the extent possible and desirable (see related discussions in Chapters 5-8). This is a key part of the thematic map specification/design process, and should include the limits of acceptable errors in the final measurement results. When the measurand is calculated, a model is used to quantify the relationship and interactions among the significant input parameters. For our purposes, we have extended the definition of measurand to the way in which a quantity is represented by the particular measurement method or model used. It is also important to describe the limiting (i.e., boundary) conditions under which a measurand is valid, according to its model.

[Figure 9.2 flow diagram:
(a) Define the interim and final measurands needed for production purposes, their boundary conditions, and acceptable limits for the combined uncertainties. State the associated models.
(b) Obtain the mean and standard deviation for the input parameters of each model, according to repeatability and reproducibility conditions.
(c) Calculate the uncertainty and combined uncertainty for each measurand using the new methodology.]

Figure 9.2. Summary of steps for determining the uncertainty and combined uncertainty of measurement for each interim and final measurand.

For example, let us define one particular measurand as the average roughness of the terrain, as represented by height variations of a given ground area sampled in a specified way. Suppose that the measurand is determined by a series of ground measurements R_i using a ruler. The ruler is placed in a position that is perpendicular to the ground at each measured point, and the height is read from the scale of the ruler. A limiting condition is that the roughness measurement is only valid for the particular terrain surveyed, and it assumes that the roughness is, in general, spread uniformly across the area sampled. If R_A is the average roughness and n is the number of independent measurements, then:

R_A = (1/n) Σ R_i, where i varies from 1 to n.

The experimental variance of the observations is given by:

s²(R_i) = [1/(n-1)] Σ (R_i - R_A)².

The positive square root s(R_i) gives the experimental standard deviation, which is an expression of the variability of the measurements. These two quantities, R_A and s(R_i), now characterize the measurand (i.e., the terrain roughness as defined here).
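A minimal computational sketch of these two formulas follows; the ruler readings are hypothetical.

```python
import statistics

# Hypothetical ruler readings R_i (in cm) at n sample points across the terrain.
R = [2.1, 3.4, 1.8, 2.9, 2.5, 3.1]

R_A = statistics.mean(R)                # R_A = (1/n) * sum of R_i
s2 = statistics.variance(R, xbar=R_A)   # [1/(n-1)] * sum of (R_i - R_A)^2
s = s2 ** 0.5                           # experimental standard deviation s(R_i)

print(f"average roughness R_A = {R_A:.2f} cm, s(R_i) = {s:.2f} cm")
```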

This leads us into part (b) on determining the experimental mean and standard deviation for the primary input parameters. (The mean and standard deviation may also be taken from established values, when available.) In the above case of terrain roughness, only a single input parameter—the roughness height—was used. However, most of the measurands in the four processing steps of Figure 9.1 are functions of several input parameters. (For example, the point radar equation discussed in Chapter 5 uses five input parameters.) Thus, for every input parameter the mean and standard deviation need to be obtained. These values are most reliable when the repeatability and reproducibility conditions discussed in Chapter 4 are applied.

In part (c) the total combined uncertainty is expressed according to a mathematical function f_i used to represent a measurand Y_i in terms of:

Y_i = f_i(X_i1, X_i2, ..., X_iN)

where i denotes the particular function (i.e., model) under study, ranging over functions 1 to M, and j indexes the input parameters of each function i, ranging from 1 to N input parameters. The combined total of the Y_i functions can be expressed as:

Y_TOT = Y_1 + Y_2 + ... + Y_M = Σ(i=1..M) f_i(X_i1, ..., X_iN)

The models f_i presented throughout this dissertation demonstrate how to apply these equations; each f_i represents a model such as the radar point equation. By using the outputs of a set of models arrived at earlier in the processes of Figure 9.1 (such as the backscatter σ° and the returned power P_r), and using them as input parameters to later models in the process, the total combined uncertainty of a final output quantity may be obtained. Examples of final output quantities are the total uncertainty of an absolute position (x,y,z) of an image pixel relative to the ground, the radiometric value of a pixel, and the accuracy of a given feature classification.

To obtain the total combined uncertainty μ_c,TOT, we apply propagation of error according to the new methodology and obtain:

μ²_c,TOT = Σ(i=1..M) Σ(j=1..N) (∂f_i/∂x_ij)² · μ²(x_ij)

where the partial derivatives ∂f_i/∂x_ij are evaluated at X_ij = x̄_ij. Thus the total uncertainty of a final measurement result tells us the extent to which the final outcome is reliable, to within a reasonable probability and set of limiting circumstances.
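The chaining of models can be sketched as follows; the partial derivatives and input uncertainties used here are hypothetical placeholders, not values from the dissertation's models.

```python
import math

def propagate(partials, u_inputs):
    """First-order propagation of error:
    u_c^2 = sum over j of (df/dx_j)^2 * u(x_j)^2."""
    return math.sqrt(sum((p * u) ** 2 for p, u in zip(partials, u_inputs)))

# Stage 1: a model f1 with two input parameters.
u_stage1 = propagate(partials=[0.976, 1.0], u_inputs=[0.05, 0.5])

# Stage 2: the stage-1 output becomes one of stage 2's inputs, so u_stage1
# enters the sum alongside the uncertainties of stage 2's own parameters.
u_stage2 = propagate(partials=[2.0, 0.3], u_inputs=[u_stage1, 1.2])

print(f"u(Y_1) = {u_stage1:.3f}; combined u after stage 2 = {u_stage2:.3f}")
```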

It should be noted that significant gaps exist in the available models at each stage,

which limit the application of the methodology. This highlights the need for improved

models in each of the four processes, described below.

In Process 1: Sensor and Scene Interactions (Chapter 5), only fragmented models

are available for parts of the process. Comprehensive models that take into account the main influence parameters and yield the values for amplitude, phase, and polarization

(the primary measurands) do not exist. Furthermore, absolute means for signal calibration are very limited and do not allow the measurements to be traced back to absolute quantities that do not change—another requirement of the methodology.

The values for amplitude, phase, and polarization from Process 1 then become the main inputs for Process 2: Antenna/Receiver and Processor Functions (Chapter 6). The early signal processing steps are well-modeled, progressing successively through a chain of linked operations that correct for antenna/receiver errors. However, the models become fragmented when linked to radiometric and geometric corrections involving variations in the landscape. At this stage, additional inputs relate the pixel locations to the corresponding absolute locations on the ground. The main output from process 2 then becomes the corrected amplitude, phase, polarization, and the corrected pixel locations in

x, y, and z. However, there are no comprehensive models that allow these values to be traced from the input to the output while accounting for the main influence parameters.

234 Next, a significant disconnect is noted as we move into Process 3: Classification

Models and Algorithms (Chapter 7). At the point that the output from process 2 becomes the input to process 3, in general all of the prior error models are ignored and the data are simply accepted at face value. The input data are essentially assumed to be without error within Process 3. The input data are typically processed further into intermediate products, such as a magnitude image, interferogram, and polarimetric covariance matrix, without considering the effect of error propagation on the intermediate products. Further, the uncertainty that is introduced by the particular classification model or algorithm and its additional and/or derived parameters is not considered. The primary output of process

3 is the assignment of feature class labels to regions of the SAR/InSAR data, grouped according to similar spatial or spectral characteristics. Because the outputs are nominal (class) data, new methods are needed to apply the methodology directly (rather than indirectly, as described in Chapters 7 and 8).

Typically, the output of process 3 does not correspond to the input needed for

Process 4: Map Output Requirements. In process 4 an organization determines the thematic map classes it needs to perform certain functions. For production purposes, it is important to clearly specify the classification rules and their limits for each feature class.

In this way, features that differ to some degree can be assigned unambiguously to the proper feature class, thereby promoting repeatability and reproducibility of results. The fact that the output classes of process 3 do not match the input classes (i.e., class specifications) of process 4 creates a number of problems in applying the methodology.

Once again, we cannot carry uncertainties introduced in process 3 into process 4, nor can we evaluate the true effectiveness and accuracy of the methods used in process 3.

Therefore, in order to apply the methodology to the end-to-end process of generating thematic maps from SAR/InSAR data, a number of improvements are needed to produce the reformulated models shown in Figure 9.3. The combined processes will yield the total combined uncertainties μ_l, μ_o, μ_r, and μ_TOTAL, in which uncertainties in prior stages are included in the following stages according to standard error propagation techniques.

By following this process, reliable, repeatable, and reproducible thematic maps of known variances could be generated in a production mode, resulting in sound decision-making.

Figure 9.3. (See following page.) The reformulated end-to-end method for thematic maps in production mode is shown. The rectangles correspond to the collective models applied as part of each process 1, 2, 3, or 4; the parallelograms correspond to the primary input and output data; and the diamonds correspond to decisions. The output of one model or process becomes the input of the next. Propagation of uncertainty is taken into account throughout. For each model or process, the three basic steps shown in Figure 9.2 can be applied to determine the uncertainty. The combined uncertainty can be represented as: μ_l = μ(i, j, k) for the output of process 1; μ_o = μ(l, m, n) for the output of processes 1 and 2; μ_r = μ(o, p, q) for the output of processes 1, 2, and 3; and μ_TOTAL = μ(r, s, t) for the output of processes 1-4. Ultimately, processes 3 and 4 can be combined once the final class descriptors are determined.

[Figure 9.3 flow diagram: Signal sent and returned from landscape + μ_i, and calibration measures tied to common standards + μ_j → Process 1: Sensor & Scene Interactions + μ_k → Returned amplitude, phase, polarization + μ_l → Process 2: Antenna/Receiver and Processor Functions + μ_n, with X, Y, Z pixel positional corrections + μ_m → Corrected amplitude, phase, polarization, X, Y, Z + μ_o → Process 3: Classification Models and Algorithms + μ_q, with one-of-a-kind, informal class descriptors + μ_p (the “class descriptors” determine the sensor(s) and data needed to distinguish them) → Pixels labeled according to method of Process 3 + μ_r → Process 4: Map Output Requirements + μ_t, with formal, unambiguous class descriptors + μ_s.]

Figure 9.3 Reformulated model for thematic maps in production mode.

Two points can be added here to the total methodology. The first is that the choice of a particular type of data (such as SAR/InSAR, lidar, LandSAT, ground truth, etc., or some combination) should be based on its ability to yield an uncertainty that fits within the required specifications and budget. Further, in order to reduce production costs, it may be necessary to revise the measurand descriptions to better fit the available data, such as grouping together two classes that cannot be distinguished in the available sources, or accepting a greater uncertainty, if it can be done without overly compromising the end uses. The second point is that improved methods are needed to evaluate the accuracy of classified maps, in a way that allows areal or linear features of the test dataset to be compared with the truth dataset, and not just the total number of pixels by class. This remains an unsolved area for further research.

9.3 Conclusion

Based on the information available at the time of this study, the conclusion is that SAR/InSAR data are probably not well suited for large-scale thematic mapping in a production environment, except in very limited cases. This conclusion rests on the signal's strong dependence on physical interactions with the terrain at the wavelength (centimeter) scale. Several factors combine to make interpretation of the signal particularly difficult: the large variance in natural conditions; the inability of current technology to distinguish the ambiguous signal effects; and constructive and destructive patterns of interference. Even slight changes in landscape conditions can produce large changes in the signal response, while signal returns are often ambiguous in nature. Nonetheless, it is anticipated that testing the levels of uncertainty at each of the four processes using the methodology described here is likely to yield conclusive evidence as to the total range of uncertainties that can be expected in SAR/InSAR thematic map production.

Perhaps the most promising use of SAR/InSAR is for production of elevation data, so long as the conditions for their production are well controlled and the related uncertainties are understood and clearly stated. Uncertainty modeling of elevation data reflects a relatively high confidence in range (in contrast with backscatter). This is because the uncertainty μ_ρ in local terrain elevation is far less than the slant range ρ to the sensor (especially in the satellite case), giving a very low ratio μ_ρ/ρ and hence a relatively low value of uncertainty. Still, some aspects of InSAR-derived elevations are troubling, particularly in areas of steep terrain. These include but are not limited to layover, shadowing, and distortion in range.
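For a rough sense of the magnitudes (the values here are illustrative assumptions, not results of this study): with a local elevation uncertainty on the order of μ_ρ ≈ 10 m and a satellite slant range of ρ ≈ 850 km,

$$\frac{\mu_\rho}{\rho} \approx \frac{10\ \mathrm{m}}{8.5 \times 10^{5}\ \mathrm{m}} \approx 1.2 \times 10^{-5},$$

a very small relative uncertainty, consistent with the relatively high confidence in range noted above.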

9.4 Questions for Future Research

A number of potential future research directions were identified as a result of this investigation. Some are discussed earlier in the chapter; others in Appendix B; and yet others are listed below. Additional background and rationale for each question may be found throughout this dissertation. The ones considered most important are indicated with an asterisk (*).

1. What is the uncertainty associated with change and variations across a landscape, and from scene to scene?*

2. What is the true measure of uncertainty related to speckle?* What is the validity of averaging or multilooking to reduce the effect?*

3. Can a comprehensive model be developed for backscatter, incorporating all of the major contributions?* And if so, how might this be useful in feature classification?

4. What degree of uncertainty is introduced in the sampling process of a truth dataset?

5. What degree of uncertainty is associated with changing numerical data and models into nominal feature classes?

6. What is the degree of uncertainty and bias associated with different kinds of truth datasets?

7. A representation model is only an approximation of the true value of the quantity that is being measured. How can the degree of uncertainty be determined for different types of models?

8. Can a method be developed to compare measurement data with truth data using numerical data, before it is converted into nominal categories?

9. Can the concepts of Certified Reference Materials and uniform calibration standards be incorporated in thematic mapping?

10. Can a series of tests be devised for remote sensing data that allows features with distinguishing characteristics in the data to be determined without ambiguity?

9.5 Note

1. When input parameters are correlated, the total combined variance μ_cTot² is given by:

$$\mu_{cTot}^{2} = \sum_{j=1}^{N}\left(\frac{\partial f}{\partial x_{j}}\right)^{2}\mu_{j}^{2} + 2\sum_{k=1}^{N-1}\sum_{l=k+1}^{N}\frac{\partial f}{\partial x_{k}}\frac{\partial f}{\partial x_{l}}\,\mu(x_{k}, x_{l}),$$

where k and l correspond to correlated quantities in the expression f, and μ(x_k, x_l) is their covariance.
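A small numeric sketch of this expression, using invented sensitivity coefficients and a single correlated pair of inputs, might look as follows:

    import numpy as np

    # Hypothetical sensitivity coefficients c_j = df/dx_j and standard
    # uncertainties mu_j for three input quantities.
    c = np.array([1.0, 0.5, 2.0])
    mu = np.array([0.10, 0.20, 0.05])

    # Hypothetical covariance between inputs 1 and 2 (correlation r = 0.6);
    # all other pairs are assumed uncorrelated.
    cov_12 = 0.6 * mu[0] * mu[1]

    var_tot = np.sum((c * mu) ** 2) + 2.0 * c[0] * c[1] * cov_12
    print(f"combined standard uncertainty: {np.sqrt(var_tot):.4f}")  # ~0.205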

APPENDIX A

PARADIGMS ADAPTED FOR SAR/INSAR CLASSIFICATION

Many experiments in SAR/InSAR classification are reported in the scientific literature. Upon further study, it can be observed that the framework guiding these activities is adapted from at least three paradigms: (1) cartographic map generalization, (2) remote sensing classification, and (3) machine learning. Cartographic map generalization offers a general theoretical framework for defining and specifying the experimental parameters. Remote sensing provides concepts and techniques for assessing the accuracy of the final results. Finally, machine learning offers insights as to the importance of appropriate representations in models and algorithms. It can be observed that these concepts have been adapted, blended, and applied to the task of SAR/InSAR image classification.

The three paradigms are discussed in this appendix and summarized in Chapter 2.

Critiques are offered at the end of each section. The critiques center on how, in the process of the "informal adaptation" of these paradigms and their combination into a new guiding paradigm, some of the original guiding concepts may have been overlooked. The apparent outcome is less rigor (and therefore less reliable results) in SAR/InSAR classification than might otherwise have occurred if the original principles had been followed with greater care. This suggests that applying some of the "missing" principles in an improved theory and methodology could result in improved practices.

Another related paradigm that influences the others is discussed in section A.4: the practice of classification according to set theory. The section discusses both general and specific principles as applied to the topic under study. The way in which classification is applied can greatly influence the accuracy and comparison of results from study to study, and points to the desirability of using standard feature class descriptions in scientific studies (discussed in Chapter 8).

A.1 Paradigm 1: Cartographic Map Generalization

Phenomena about the Earth occur in enormous complexity. The cartographic method selects, simplifies and processes certain aspects of the environment into effective portrayals, which are typically annotated maps. Whereas the real Earth may exhibit many thousands of complexities on different levels, a map may only present a few of them in simplified form. Processing of selected geographic features ranges from simple statistical measures such as averages and ratios, to more complex operations such as simplification and generalization—together called generalization.

Four basic categories of geographic phenomena exist [9]: zero-dimensional point data (0-D) that represent single places or positions such as control points; one-dimensional linear data (1-D) that represent linear features such as roads or streams; two-dimensional areal data (2-D) for features such as lakes or forest; and three-dimensional volumetric data (3-D) such as the amount of soil in a given region to be moved in a construction project. To these we can add a fourth dimension, time (4-D), in order to depict landscape changes over time due to natural and manmade effects. Most of the studies in SAR/InSAR are heavily dominated by 2-D (flat picture) generalizations, perhaps at least partly due to a carryover from reliance on paper maps, even though SAR/InSAR data can also be used for 3-D (interferometry and stereo to reconstruct surfaces, and implied volume) and 4-D representations (to depict velocity and movement over time, as in animation). Since most SAR studies center on 2-D data, this will be our main emphasis as well.

In Elements of Cartography [9], authors Robinson, Sale, Morrison, and Muehrcke describe the elements of cartographic generalization: simplification, classification, symbolization, and induction:

• Simplification: determining the important characteristics of the data, eliminating unwanted detail, and retaining and possibly exaggerating the important characteristics. Which specific data to retain is one of the cartographer's primary problems. Exaggeration makes small features visible and/or emphasized at the viewing scale. As map scales get smaller, fewer features can be displayed. (Also known as specifying the desired features in image classification, as a subset of the very complex image.)

• Classification: ordering and grouping (clustering) data to bring relative simplicity from a complexity of differences. Common classification processes include grouping similar qualitative phenomena such as vegetation into categories, e.g., cropland, grassland, and forest. (Also known as segmentation in typical image processing functions.)

• Symbolization: assigning graphic codes to the grouped data, establishing significance, and assigning relative position and prominence. The graphic coding makes the generalization visible, and links the meaning of a feature to its representation, such as coding water as blue, agriculture as light green, and forest as dark green. When selecting the data to be represented, the intended prominence of certain features should be clear. (Also known as labeling or representation in other fields of study.)

• Induction: applying logical geographic inference to extend the map content, such as inferring area properties from a collection of point samples, or drawing contour lines through an area that is partially occluded by forest. Once depicted in a map, new relationships among features can become evident. (Also known as sampling in remote sensing studies.)

The authors describe the factors that influence how each of the above four generalization processes is performed. These are subject to the controls of cartographic generalization: the objective, scale, graphic limits, and quality of data:

• Objective: the purpose of the map has a great deal to do with how it is designed; the kind of audience to which it is aimed is another factor.

• Scale: the ratio of the map to the Earth. A smaller scale requires greater generalization. The amount of detail needs to be commensurate with the scale(s) at which the map will be viewed. Table A.1 [10] relates scale to raster spatial resolution, based on a 0.1 mm printed pixel.

• Graphic limits: important controls on the generalization process. Physical limits are imposed by the equipment, materials, and skills of the technician. Physiological and psychological limits are due to the map user's perceptions and reactions to the graphic elements as presented.

Scale            Ground pixel size (m)
1:5,000          0.5
1:10,000         1
1:50,000         5
1:250,000        25
1:500,000        50
1:5,000,000      500
1:10,000,000     1000
1:50,000,000     5000

Table A.1. Suggested maximum scales of photographic products as a function of effective ground pixel size [10]. The printed pixel size is 0.1 mm in this case.
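The relationship in Table A.1 is simply the scale denominator applied to the printed pixel size; a minimal sketch, assuming the 0.1 mm printed pixel used in the table:

    def ground_pixel_m(scale_denominator: int, printed_pixel_mm: float = 0.1) -> float:
        """Ground size (m) of a pixel printed at printed_pixel_mm on a 1:n map."""
        return scale_denominator * printed_pixel_mm / 1000.0  # mm -> m

    for n in (5_000, 50_000, 500_000, 50_000_000):
        print(f"1:{n:,} -> {ground_pixel_m(n)} m")  # 0.5, 5, 50, 5000 m, as in Table A.1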

• Quality of data: the reliability and precision of the various kinds of data being mapped. It is important that the data and their presentation do not convey a greater impression of completeness and reliability than warranted.

Several years later McMaster and Shea [11] look at the same problem of generalization, but update the concepts to reflect new developments in computer technology. They argue that generalization is fundamentally connected to scale, and that computer technology has not solved the problem of generalization but has only heightened the need to generalize (given the ever-greater volumes of data) and to understand the generalization process better. Their assessment of digital and raster-based generalization is summarized below, and provides a partial theoretical framework for SAR image classification.

McMaster and Shea's digital generalization model is comprised of three aspects: (1) why to generalize, (2) when to generalize, and (3) how to generalize with appropriate spatial and attribute transformations. Before geographical features and attributes can undergo generalization through the use of operators, decisions must be made as to which objects/attributes should be included in the generalized map; this is known as the selection process. Once these are selected, spatial or attribute transformations are applied using operators such as those shown in Figure A.1. The particular operators listed under each heading are intended as examples and are not exhaustive.

Structural generalization
  simple structural reduction
  resampling (non-weighted, category-weighted, neighborhood-weighted)

Numerical generalization
  low-pass filters
  high-pass filters
  compass gradient masks
  vegetation indices

Numerical categorization
  minimum-distance to means
  parallelepiped
  maximum likelihood

Categorical generalization
  merging (of categories)
  aggregation (of cells)
  attribute change

Figure A.1. McMaster and Shea's Raster Generalization Operators [11]

Most of the associated work in raster generalization is found in the remote sensing literature (see Section A.2). Remote sensing data are typically represented as numerical data in a raster format, wherein each pixel, or cell, within a 2-D matrix represents a block area on the surface of the ground (such as 30 meters).

It is interesting to note that the SAR/InSAR image classification literature focuses largely on how to generalize (i.e., classify), but little attention is given to why and when. That is, the focus is on the algorithmic mechanism by which classification occurs. However, it is clear that in order to have accurate and useful maps, basic map design concepts require that we: (1) know what are the important features to classify; (2) define the purpose for which the final map is to be used, to aid in structuring the classes; and (3) have a clear understanding or definition of the capture conditions, i.e., how a certain feature or feature type is defined (e.g., how sparse or dense do the trees in mixed forest/grassland need to be, in order to be classified as one or the other)?

It can be observed from the literature that the principles of the cartographic map generalization paradigm, as applied to SAR/InSAR classification, tend to be weakly applied in the following areas:

• Explicit statements are not usually offered as to what are the important features to classify, the intended feature prominence, and why. Rather, a study tends to choose classes as a matter of convenience, according to the dominant characteristics of a given scene. There is often a priori selection of a particular scene to align with the desired features sought.

• Features/classes are not clearly and explicitly defined (capture conditions). What one operator considers to be forest may differ considerably from that of the next operator.

• Explicit statements are generally not offered in the scientific studies as to what scale is needed in the end product. Rather, the studies tend to be dataset-driven: from this dataset we can capture these characteristics. "Let's see what this sensor is capable of doing," rather than, "What are the specifications for the job that needs to be done, and to what degree can this sensor meet those specifications?"

• Physical, physiological, and psychological limits of the sensor, hardware, software, and technicians and users tend not to be taken into account. Rather, the main emphasis is the novelty of the approach as demonstrated on a single scene, regardless of the amount of effort it takes to get the results.

• The reliability, precision, and completeness of the mapped data are not accurately represented, because the assessment omits a great deal of information about the sensor, processing, algorithmic, classification, and other types of error, beyond a simple "truth table" on a single scene.

Part of the critique reflects the fact that the studies are designed to be exploratory, new, and cutting-edge since they are being written for research journals. But in this process, the reports overlook the need to study the limits of consistency and the reliability of their particular approach under somewhat different conditions—which is essential information in moving the technology from one-of-a-kind research studies into a production mode.

A.2 Paradigm 2: Remote Sensing Image Classification

According to L.S. Lindenlaub, remote sensing is:

"...the science and art of acquiring information about material objects from measurements made at a distance—measurements made without coming into physical contact with the materials of interest. These measurements are possible because instruments can be designed to measure spectral, spatial and/or temporal variations in field strength. To complete the remote sensing process the data must be analyzed: such analysis may be carried out using image interpretation techniques, numerical analysis techniques or a combination of the two."

In this section the paradigm of remote sensing image classification (as it refers to multispectral images) is discussed. It is important to our subject because many SAR/InSAR image classification algorithms and assessment techniques are adapted from this paradigm. Typically the sensor is passive and relies on EM energy at several wavelengths as reflected, refracted, or emitted from the Earth's surface due to the sun's energy.

Schowengerdt [12] discusses the use of spectral signatures of surface materials, such as vegetation, soil, and rock. The motivation of multispectral remote sensing is that different types of materials can often be distinguished on the basis of differences in spectral signatures. But recognition is made difficult by many factors, including

• Natural variability of a given material

• Coarse spectral range of many remote-sensing systems

• Effect of atmosphere on signatures

Therefore, while the goal is to apply different labels to different materials, the spectral reflectance data ("signatures") differ according to the sample and the environment in which they are measured. Vegetation is especially variable due to continuing changes in growth stage, plant health, and moisture content. Because of the changing environment, multispectral image analysis relies on comparison of relative signatures of materials for single images, rather than absolute signatures.

According to Richards [10], classification is a method for attaching labels to pixels according to their spectral character (e.g. signatures). This labeling is carried out by a computer, by training it beforehand to recognize pixels with spectral similarities.

Typically, a classification yields two types of outputs. The first is a thematic map of classes, in which different colors and/or patterns are assigned to groups of pixels corresponding to similar types of areal features. The second output is a table that summarizes the number of pixels in the image that were assigned to each class. The number of pixels corresponds to the number of acres on the ground, giving an assessment of the coverage by feature type.

The typical methods for classification of remotely sensed images fall into two major categories, described below: supervised and unsupervised classification.

A.2.1 Supervised Classification Techniques

Supervised classification [10] is the procedure most often used for quantitative analysis of remote sensing image data. It relies on suitable algorithms to label the pixels in an image as representing particular ground cover types, or classes. Regardless of the particular method chosen, the essential practical steps are:

1. Decide the set of ground cover types into which the image is to be segmented.

2. Choose representative or prototype pixels from each of the desired set of classes. These pixels are said to form training data.

3. Use the training data to estimate the parameters of the particular classifier algorithm to be used.

4. Using the trained classifier, label or classify every pixel in the image into one of the desired ground cover types (information classes).

5. Produce tabular summaries or thematic (class) maps which summarize the results of the classification.

The analyst directs ("supervises") the classification by selecting representative areas on the image, and verifying the type of feature with other sources such as ground survey, aerial photographs, and maps. These targeted areas, covering perhaps 1%-5% of the image, are the training data used to develop spectral signatures for classes of interest. Signatures that are generated from the training data vary, depending on the type of classifier used. The most common method is maximum likelihood classification, whereby the signatures are obtained using both class mean vectors and covariance matrices. Often a threshold or limit is applied so that poorly represented pixels will not be classified.
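As a compact illustration of this step, the following is a generic Gaussian maximum likelihood classifier written for this discussion (not the implementation of any particular study cited here):

    import numpy as np

    def train_gaussian_ml(training):
        """Estimate a mean vector and covariance matrix per class.

        training maps class name -> array of shape (n_samples, n_bands)."""
        return {name: (x.mean(axis=0), np.cov(x, rowvar=False))
                for name, x in training.items()}

    def classify_pixel(x, params, threshold=-20.0):
        """Assign pixel x to the class of highest Gaussian log-likelihood.

        Pixels whose best log-likelihood falls below the threshold are left
        unclassified, mirroring the thresholding practice described above."""
        best, best_ll = None, -np.inf
        for name, (mean, cov) in params.items():
            d = x - mean
            ll = -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))
            if ll > best_ll:
                best, best_ll = name, ll
        return best if best_ll > threshold else "unclassified"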

A.2.2 Unsupervised Classification

Here we follow Schott's discussion [2] on unsupervised classification. In some cases, it is useful to have the computer determine which pixels have similar characteristics (e.g., spectra), instead of trying to force the pixels into a class based on our perception of class similarities. This can be done using an unsupervised classifier. The simplest and most commonly used method is the k-means approach, requiring the user only to specify the number of classes (k) in the image. The algorithm then attempts to find the mean vector and covariance matrix for each class. Each pixel is assigned to a class based on how close (minimum distance to the mean) it is to the mean vectors.
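A minimal sketch of the k-means step just described (cluster means only; per-class covariance matrices would be estimated afterwards from the final assignments):

    import numpy as np

    def kmeans(pixels, k, n_iter=20, seed=0):
        """Cluster pixels of shape (n_samples, n_bands) into k spectral classes."""
        rng = np.random.default_rng(seed)
        means = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
        for _ in range(n_iter):
            # Assign each pixel to the closest class mean (minimum distance).
            dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Recompute each class mean from the pixels assigned to it.
            for j in range(k):
                if np.any(labels == j):
                    means[j] = pixels[labels == j].mean(axis=0)
        return means, labels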

The resulting classes indicate the natural spectral clusters in the data. They may or may not correspond to the desired land cover or materials classes. Because of this limitation, unsupervised classification is often used as a preprocessor for other algorithms. Although not as much effort is required in unsupervised classification, the results tend to be not as accurate as in supervised classification. For this reason supervised classification is the dominant method used today.

A.2.3 Accuracy Assessment

After classification it is important to assess the accuracy of the results, in order to assign a degree of confidence to them. The "goodness" of the classifier is often evaluated by a confusion matrix, which compares the number of pixels in each class as output by the classifier against the true number of pixels in each class. "Truth" is determined by field work, photographs, and other information sources. A sample of pixels is selected from the thematic map, typically recommended to follow stratified random sampling by class. Then their labels are checked against classes from reference data, optimally through site visits. From this, the percentage of pixels in each class that are labeled correctly and incorrectly can be identified. The results are given in a table known as a confusion matrix (also termed "error matrix"), as depicted in Figure A.2. The diagonal of the table represents correct matches; the columns are errors of omission (pixels were not recognized as belonging to the target class); and the rows are errors of commission (pixels were incorrectly labeled as belonging to another class). The overall classification accuracy (such as 84%) is obtained by averaging the percent of correct classifications, or the average can be weighted by the relative areas of classes.

When a study of error (e.g., error matrix) is well designed and carried out according to standard remote sensing practices, the result is one measure of how well a particular classification method performed on a single dataset. However, it does little to aid the task of evaluating different classifiers via cross-comparisons [13] or addressing the problems identified in Section A.2.

                            Ground truth classes
                            A          B          C          Total
Thematic map classes   A    35 (70%)   2 (5%)     2 (4%)     39
                       B    10 (20%)   37 (93%)   3 (7%)     50
                       C    5 (10%)    1 (2%)     41 (89%)   47
Number of ground
truth pixels                50         40         46

Figure A.2. Confusion (error) matrix for assessing classification accuracy [1]
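From the counts in Figure A.2, the summary accuracies can be computed directly; a short sketch (averaging the per-column percentages reproduces the roughly 84% figure quoted above):

    import numpy as np

    # Rows: thematic map classes A, B, C; columns: ground truth classes A, B, C.
    cm = np.array([[35,  2,  2],
                   [10, 37,  3],
                   [ 5,  1, 41]])

    overall = np.trace(cm) / cm.sum()            # 113/136, about 83%
    producers = np.diag(cm) / cm.sum(axis=0)     # 1 - omission error (per column)
    users = np.diag(cm) / cm.sum(axis=1)         # 1 - commission error (per row)

    print(f"overall: {overall:.0%}, mean producer's: {producers.mean():.0%}")  # 83%, 84%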

The issue of obtaining unbiased ground reference information for use as the "truth" dataset should not be overlooked. Jensen discusses the following issues related to training, testing, and sampling [13].

Use of training vs. test reference information

It is not uncommon for error evaluation to be based only on training pixels used to train the classification algorithm. However, the locations of these training sites are usually not random but are biased by the analyst's a priori knowledge of where certain land-cover types exist. This introduces a bias, since the classification accuracies for training pixels are generally higher than for the rest of the map. It is much better to locate reference test pixels in the study area. These sites are not used in the training of the classification algorithm and are therefore unbiased.

Total number of samples collected by category

The optimal number of pixels to reference on the ground and use to assess the accuracy of individual categories in the remote sensing classification map can be difficult to determine. Due to the large number of pixels in remotely sensed data, traditional sampling methods do not apply. Jensen suggests that a good rule of thumb is to collect at least 50 samples for each land-cover class in the error matrix, and if the area is large, up to 100 samples per class.

Sampling scheme

Normally the method of choice is to utilize a stratified random sample to collect the appropriate number of samples per category. Landscapes often change rapidly; therefore, both the training and reference information should be collected as close to the date of image acquisition as possible.

Applying appropriate descriptive and multivariate statistics

Discrete multivariate techniques have been used to statistically evaluate the accuracy of remote-sensing-derived classification maps and error matrices since 1983 and are now widely adopted. The data are binomially or multinomially distributed; statistics based on normal distributions do not apply.

The need for independent test and training data in assessing the quality of the classifier is commented on by Mitchell [14]:

"The quality of the classifier should be evaluated prior to its use. First, the training data can be run through the classifier and a confusion matrix produced. This is a dependent data set and can yield inflated performance estimates if the training data are not robust. It is often supplemented by also running the classifier on data from training sites that were not included in the data used to train the classifier. This independent data set is a good check on the robustness of the classifier."

Geometric and radiometric enhancement are often performed on images as a precursor to classification. Geometric enhancement involves operations such as smoothing noise present in data, enhancing and highlighting edges, and detecting and enhancing lines. Radiometric enhancement is concerned with altering the contrast range of the pixels in an image to make certain features more readily visible. While these may improve image quality, they also introduce elements of image variability by altering image characteristics, making repeatable classification more difficult. [10]

Other Guidelines

Some further guidelines in using classifiers are offered by Schott [2]:

1. The user needs to make sure that the training data are robust enough to characterize the class fully. For example, a forest class should include samples from different slopes, aspects, stand types, and densities. In addition, the user must ensure that there are sufficient data to estimate adequately the values of M (mean vector) and S (covariance matrix); 10l samples per class is a practical minimum, and 100l is a desirable objective, where l is the number of bands.

2. Multimodal classes should generally be split into separate classes for classification and merged into a common class afterwards. For example, forest classes on east-west facing slopes may be significantly different from those on north-south facing slopes.

3. The user must ensure that all classes are included in the training process, or untrained classes will be grossly misclassified.

Ideally, we would like to think that standard classification guidelines for given types of features should be used, with the aim of classifying images with a high degree of accuracy. Such a standard would allow large amounts of time and money to be saved and help reduce the practice of "throwaway classifiers." Although some progress has been made in the past two to three decades towards developing standard signature sets for features that are invariant to changes in the landscape, in general repeatable classifiers remain an elusive goal. This is due to the fact that pixel return alone is not enough to generate unambiguous classifications in an unstable, ever-changing landscape, and may be due in part to the lack of a methodology for determining the limits of a given classifier's ability.

We saw above that image classification in SAR tends to follow the remote sensing classification paradigm. Classifiers operate on single images using various supervised and unsupervised methods. The most common means to assess the accuracy is a truth table (i.e., confusion or error matrix). In light of remote sensing theory and practice, this leads us to the following critique of current practices for performing SAR/InSAR image classification:

• There is no assessment as to whether the training classes fully characterize all of the classes (it is easy to observe that they do not, especially when our interest is cross-scene classification).

• An insufficient number of training examples is typically used (~100 per class is desirable; SAR classification training often uses fewer than 10).

• In the assessment of accuracy, the training pixels are usually a subset of the reference test pixels, skewing the representation of accuracy.

• In many cases error assessment is not done at all; when practiced, it rarely includes the use of at least 50 collected samples per class for the error matrix, via stratified random sampling (the preferred method).

Given these shortcomings, it is clear that a more rigorous means for assessing the accuracy of SAR/InSAR classification is needed.

A.3 Paradigm 3: Machine Learning

The field of machine learning recognizes the general problem of image understanding (how to make a computer understand and interpret digital images) as the most difficult visual task and the subject of most of the study in the field. Knowledge representation is one of the particularly difficult challenges. According to Rich and Knight [15], the entire problem remains unsolved for reasons such as:

• An image is 2D while the world is 3D, and some information is lost at the time the image is created.

• One image may contain several objects that may partially occlude others.

• Each pixel value is affected by many different phenomena, including the color of the object, the source of the light, the angle and distance to the camera, the pollution in the air, etc. It is hard to disentangle these effects.

As a result, 2D images are very ambiguous. Given a single image, we could construct any number of 3D worlds that would yield the same image. For example, we may bring in knowledge about low-level image features, such as shadows and textures. Multiple or moving images or stereo vision (and in our case, InSAR) of the same object or scene can also provide multiple views to help recover 3D structure. Additional sensors such as a laser rangefinder that measures the distance to parts of an object or scene can be helpful. Other image factors that might be incorporated include shading, color, and reflectance. High-level knowledge can also be important for interpreting visual data. The context and features in an object's surroundings can provide a framework for interpretation. The success of a machine learning program depends critically on the way it represents and applies knowledge.

Machine learning addresses the question of how to build computer programs that improve their performance at some task through experience. Much of learning involves acquiring general concepts from specific training examples. The field draws from many diverse disciplines including artificial intelligence, probability and statistics, computational complexity, information theory, psychology and neurobiology, control theory, and philosophy [16]. Machine learning is especially useful in domains where the program must dynamically adapt to changing conditions. This means that learning systems should be well suited to the task of learning to interpret ever-varying, real-world sensor data such as those generated by SAR/InSAR.

Briscoe and Caelli describe machine learning as:

"... a relatively new branch of Artificial Intelligence (AI). The field is currently undergoing a significant period of growth, with many new areas of research and development being explored.. .Learning is an essential component of any intelligent system, whether human, animal, or machine. Without learning, 259 systems are unable to profit from their experience or to adapt to changing conditions...[Machine] Learning involves both knowledge acquisition [the acquisition of new knowledge from external sources] and skill acquisition [the improvement of knowledge representations and structures so that existing knowledge may be better exploited].” [16]

Following an extensive literature review of hundreds of sources, Briscoe and Caelli categorize machine learning systems according to the outline in Table A.2. It is interesting to note that aspects of all categories are observed (at least in primitive form, and sometimes by different names) in the various types of SAR image classification methods, discussed in Chapter 7.

Symbolic Empirical Learning (SEL)
  Supervised (Learning from Examples)
    Decision Trees
    Star Methodology
    Version Spaces
    Least Generalization
    Inductive Logic Programming
  Unsupervised (Learning from Observation and Discovery)
    Conceptual Clustering
    Discovery

Analytical Learning / Explanation-Based Learning
  Learning Composite Rules
  Learning Search Control Knowledge

Exemplars, Case-Based Reasoning and Analogy
  Exemplars
  Case-Based Reasoning
  Analogical Reasoning

Integrated Learning Systems
  Combining various learning techniques
  Learning Apprentice Systems

Numerical Learning Systems
  Statistical Methods for Pattern Recognition
  Neural Networks
  Relational and Evidence-Based Methods
  Combinatorial Optimization and Numerical Search

Table A.2. Types of Machine Learning Systems [16]

In order to maintain the focus on solving problems instead of on particular systems or methods, Winston refers us to Marr's methodological principles [17]:

1. Identify the problem.

2. Select or develop an appropriate representation.

3. Expose constraints or regularities.

4. Create particular procedures.

5. Verify via experiments.

Although all five of these principles are important to developing an intelligent approach to the task of SAR image classification, we emphasize principle 2: an appropriate representation. This principle tends to be only weakly followed in the area of SAR image classification, yet it may be the most important. It is otherwise known as the representation principle:

Once a problem is described using an appropriate representation, the problem is almost solved. [17]

Marr describes representation as:

"... A representation is a formal system for making explicit certain entities or types of information, together with a specification of how the system does this... For example, a representation for shape would be a formal scheme for describing some aspects of shape, together with rules that specify how the scheme is applied to any particular shape... any particular representation makes certain information explicit at the expense of information that is pushed into the background. This issue is important, because how information is represented can greatly affect how easy it is to do different things with it [italics added]. For example, it is easy to add, to subtract, and even to multiply Arabic [number] representations, but it is not at all easy to do these things (especially multiplication) with Roman numerals. This is a key reason why the Roman culture failed to develop mathematics in the way the earlier Arabic cultures had." [18, p. 21]

Chapter 7 examines the various representations used in SAR/InSAR image classification. It will become apparent that representing knowledge in the domain of SAR/InSAR image classification is not an easy or simple task, because the world is an ever-changing, growing and moving environment. This is the same elusive problem faced in image understanding, that "practically anything could happen in an image and furthermore that practically everything did" [18].

Since SAR/InSAR instruments produce measurable physical effects, incorporating both qualitative and quantitative physics should play a key role in appropriate knowledge representation. Here we define qualitative physics as understanding physical processes by building simple, abstract and nonnumeric models of them; for example, things fall downward. Quantitative physics is the incorporation of specific, mathematical quantities and variables that follow natural law. Yet the physics of the scene are rarely incorporated in models pertaining to SAR image interpretation for classification.

An important design issue is the type of training experience [14]. This can have a significant impact on the success or failure of the machine learning system. How well does the training experience represent the distribution of examples over which the final system performance must be measured? In general, learning is most reliable when the training examples follow a distribution similar to that of future test examples. In practice, it is likely that the training experience will not be fully representative of the distribution of situations over which it will later be tested; it is, in fact, often necessary to learn from a distribution of examples that is somewhat different from those on which the final system will be evaluated. These situations are problematic because mastering one distribution of examples will not necessarily lead to strong performance over another. And in terms of accuracy assessment, "observed accuracy over the training examples is often a poor estimator over future examples. The measured accuracy can vary from the true accuracy depending on the makeup of the particular set of test examples." [14]

Nair et al. [19] provide guidelines for comparing the performance of object recognition systems, a subset of machine learning. The framework emphasizes the importance of measuring the computational costs of a given algorithm in terms of space and time complexity. Here the costs of computing and storage are compared against classification error and test data sample size. Computational costs should certainly be taken into account when deciding what classification method to use. For example, an algorithm that classifies a scene in 30 minutes should be rated higher than one that requires 24 hours for the same scene, given equivalent accuracy rates.

Machine learning also deals with issues involved in assessing the accuracy of classification, as follows.

A.3.1 The Problem of Accuracy Assessment

Kononenko and Bratko [5] discuss problems in estimating performance of classification accuracy. In some cases, the percentage of correct classifications is not a very appropriate measure of performance. This is because: (1) different classifiers produce different forms of answers that cannot be directly compared; (2) prior probabilities need to be taken into account when establishing evaluation criteria, since correct classification into a more probable class is more likely than classification into a less probable class; and (3) it is difficult to compare performance across different domains. A problem with more classes is generally more difficult than a problem with fewer classes. The comparison of performance in different domains is questionable because in different domains, different amounts of information may be available. The amount of available information also depends on the number of available training instances. There may also be other criteria that affect the classification assessment, such as time and cost requirements. Thus, we see that the dominant mode of assessing SAR classification accuracy via the percent of correct classifications (e.g., the confusion matrix) is deficient when used as the sole evaluation criterion.

The field of machine learning offers several important points in critiquing the methodology for classifying SAR/InSAR data, as it is reported in the literature:

• A discussion of the representation and its appropriateness to the given conditions is rarely offered.

• A discussion of the role played by the physics of the scene and sensor is rarely offered.

• A discussion of regularities and constraints of the particular dataset or method is rarely given.

• The algorithmic procedure for the classification mechanism tends to be described in great detail, but other procedures are scarcely mentioned.

• The computational complexity in time and space is rarely mentioned. No matter that the dataset size might have been only 500 pixels by 500 pixels, and it took 1 gigabyte of disk space and 12 hours to classify it!

A.4 General Observations

A common thread is that there tends to be a lack of rigor in determining the conditions for measurement and assessing the true level of uncertainty in the measurements. This indicates the need for a better-defined methodology for carrying out SAR/InSAR classification.

A higher-level paradigm can be observed that guides the paradigms discussed above. They all rely on set theory, which forms the foundation for classification practice. In set theory, a thing (such as a pixel) must be labeled (i.e., classified) as a particular, given class. Suppose we decide to use only two classes, A and B, and each thing (e.g., pixel) must be classified as one of these. The class labels are mutually exclusive. If something is labeled "A", it cannot also be labeled "B", and vice versa. Now let us suppose that the thing to be labeled is, in fact, comprised of 51% B and 49% A. However, we are only permitted to label it one or the other. Since B has a greater percentage, the pixel is labeled as B. However, this simple act produces an error rate of 49%, for a single pixel, for just this single act of labeling!

In the above example, it can be observed that there must be rules (formal or implied, arbitrary, self-directed or outside-directed) that decide what constitutes an A or a B. In the above simple case, our decision was that if the composition of one class in a pixel is greater than 50%, then the entire pixel is labeled as that class. But we could have just as easily made up another rule, such as: if a group of nine adjacent pixels each has more than 33% of one class, then all nine pixels must be labeled as that class, and so on with an endless variety of classifications.
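Such rules are easy to state precisely, which makes their arbitrariness concrete; a sketch of the two hypothetical rules just described:

    def majority_label(fractions):
        """Rule 1: label the pixel with whichever class covers the largest share."""
        return max(fractions, key=fractions.get)

    def block_label(block, threshold=1 / 3):
        """Rule 2: if every pixel in a block exceeds the threshold for one class,
        label the whole block as that class; otherwise leave it undecided."""
        for cls in block[0]:
            if all(p.get(cls, 0.0) > threshold for p in block):
                return cls
        return None

    # A pixel that is 49% A and 51% B is labeled B, discarding 49% of its content.
    print(majority_label({"A": 0.49, "B": 0.51}))  # -> B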

Because set theory (and the classification process that accompanies it) is a dominating concept in SAR/InSAR classification, we will examine it and the problems it brings in more detail, since these problems can represent a very significant variance from one user or system to another.

A.5 The Problem of Classifications

A significant problem in classification practice is that the classes themselves tend to be ill-defined. There seems to be no common agreement among the SAR image classification community as to what constitutes forest, grassland, or other features. What one classifier considers to be forest differs from another. The forest area/edges delineated by one classifier differ from another, since terrain often exhibits a smooth transition from one class to another. Densities are often mixed among classes, and even a single class can exhibit significant variation. This condition makes objective comparisons of classifiers difficult, even on the same scene.

Early work by Green and Swets describes the problem in general terms. In their 1966 classic Signal Detection Theory and Psychophysics [6], the change in human experimental response is studied with respect to different instructions. A wide range of responses to the same data are possible, given the way that the observer (i.e., classifier) is instructed. Thus how a class is defined plays a central role in the outcome of classification, because class definitions tend to be somewhat arbitrary.

Without clear, common definitions, we arrive at the condition described by Booch [7], after Stepp and Michalski [8], as "A Problem of Classification." Referring to Figure A.3, the case of ten different trains is presented, each with an engine and two to four cars, each shaped differently and carrying a different load. The reader is challenged to group the trains into meaningful sets. One person may create three groups of trains: one for engines with black wheels, one with white wheels, and another with both black and white wheels. The next person may group them into classes of either one, two, or three cars; and so on.


Figure A.3. Observers classify objects differently. [6]

As in real life, there is no "right" answer. In the experiment by Stepp and Michalski, subjects came up with 93 different classifications, with 43 of these being totally unique. Next Booch suggests changing the requirements, allowing the circles to represent toxic chemicals, rectangles to represent lumber, and all other shapes to represent passengers. In Booch's subject group, the clustering of trains changed significantly. Most subjects classified trains according to whether or not they carried toxic loads. The conclusion from this simple experiment is that having more knowledge about a domain (up to a point) makes it easier to achieve an intelligent classification.

In Chapter 8 the use of certain domain knowledge (i.e., feature classification standards) for thematic mapping is explored as a means to define the desired classifications. Without such a framework, classifications will continue to vary for different researchers, further complicating comparisons.

In summary, the following problems related to SAR/InSAR classification are noted:

• The feature classes and conditions under which they are classified tend not to be clearly defined.

• The learning problem (how to group classes) is not well-defined.

• How to evaluate classifications is not sufficiently defined.

• The data set upon which a classifier is trained is not representative of the class as a whole (across different images), leading to widely different results under varying conditions.

• Algorithms are optimized on single scenes under very limited conditions.

• Models are often based on regional statistics only, and do not incorporate the physics of the scene.

A.6 Conclusion

The above discussion points to the need for a stronger theoretical framework in which to conduct SAR/InSAR image classification. Chapters 2, 3, and 4 discuss the need for such a framework and lay out a new methodology, in order to achieve more reliable classification results.

A.7 References

[1] Mitchell, Tom M., Machine Learning. New York: McGraw-Hill Companies, Inc., 1997.

[2] Schott, John R., Remote Sensing: The Image Chain Approach. New York: Oxford University Press, 1997, pp. 233-288.

[3] Defense Advanced Research Projects Agency (DARPA), "Image Understanding for Battlefield Awareness," Broad Agency Announcement (BAA) 96-14, issued September 12, 1996, Section C.

[4] Briscoe, Garry, and Caelli, Terry, A Compendium of Machine Learning. Volume 1: Symbolic Machine Learning. Norwood, New Jersey: Ablex Publishing Company, pp. 11, 13.

[5] Kononenko, Igor, and Bratko, Ivan, "Information-Based Evaluation Criterion for Classifier's Performance," Machine Learning, 6(1):67-80, 1991, pp. 68-80.

[6] Green, David M., and Swets, John A., Signal Detection Theory and Psychophysics. New York: John Wiley & Sons, Inc., 1966, pp. 34-35.

[7] Booch, Grady, Object-Oriented Analysis and Design with Applications. Second Edition. Reading, Massachusetts: Addison-Wesley, 1994, pp. 145-168 (chapter on classification).

[8] Michalski, R., and Stepp, R., "Learning from Observation: Conceptual Clustering," in Machine Learning: An Artificial Intelligence Approach, (Eds.) R. Michalski, J. Carbonell, and T. Mitchell, Palo Alto, CA: Tioga, 1983.

[9] Robinson, Arthur H.; Sale, Randall D.; Morrison, Joel L.; and Muehrcke, Phillip C., Elements of Cartography. Fifth Edition. New York: John Wiley & Sons, 1984, pp. 107-134.

[10] Richards, John A., Remote Sensing Digital Image Analysis: An Introduction. Berlin: Springer-Verlag, 1986, pp. 29-30, 225-233.

[11] McMaster, Robert B., and Shea, K. Stuart, Generalization in Digital Cartography. Washington, D.C.: Association of American Cartographers, 1992, pp. 99-121.

[12] Schowengerdt, Robert A., Remote Sensing: Models and Methods for Image Processing. Second Edition. San Diego: Academic Press, pp. 11-16.

[13] Jensen, John R., Introductory Digital Image Processing: A Remote Sensing Perspective. Second Edition. Upper Saddle River, New Jersey: Prentice Hall, 1996, pp. 247-262.

[14] Mitchell, Tom M., Machine Learning. New York: McGraw-Hill Companies, Inc., 1997, pp. 5-18, 145-151.

[15] Rich, Elaine, and Knight, Kevin, Artificial Intelligence. Second Edition. New York: McGraw-Hill, Inc., 1991, 621 pages.

[16] Briscoe, Garry, and Caelli, Terry, A Compendium of Machine Learning. Volume 1: Symbolic Machine Learning. Norwood, New Jersey: Ablex Publishing Company, pp. 1-9.

[17] Winston, Patrick Henry, Artificial Intelligence. Third Edition. Reading, MA: Addison-Wesley Publishing Company, 1993, 737 pages.

[18] Marr, David, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W.H. Freeman and Company, 1982, pp. 1-41.

[19] Nair, Dinesh, Mitiche, Amar, and Aggarwal, J.K., "On Comparing the Performance of Object Recognition Systems," Proceedings of the Second International IEEE Conference on Image Processing, October 23-26, 1995, Washington, D.C. Los Alamitos, CA: IEEE Computer Society Press, pp. 631-634.

APPENDIX B

DISCUSSION OF ISSUES RAISED IN THIS INVESTIGATION

A number of issues arose in the course of this research. Further consideration is given here to several issues, including: backscatter modeling; the validity of a Gaussian assumption in error modeling; further discussion on the measurand concept; and a preliminary methodology for designing a production mapping project using SAR/InSAR data. These are discussed below.

B.1 The Normal Distribution

If we assume that SAR/InSAR data are distributed normally, then when N independent looks are averaged, the estimates will cluster closer to the actual value. According to the Central Limit Theorem [1], by increasing the number of independent observations from one to N, we reduce the uncertainty of the mean by a factor of √N. This means that for an 8-look image (N = 8), the uncertainty relative to a single-look image will be reduced by a factor of 2.8. Then if our original uncertainty was within a range of 50% of the measurement, under an 8-look scenario the amount of uncertainty due to speckle would be reduced to a range of about 18% of the measurement.
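A quick simulation illustrates the √N reduction claimed above, assuming, as the text does, independent and identically distributed looks:

    import numpy as np

    rng = np.random.default_rng(42)
    true_value, sigma, n_trials = 1.0, 0.5, 100_000  # 50% single-look uncertainty

    single_look = rng.normal(true_value, sigma, size=n_trials)
    eight_look = rng.normal(true_value, sigma, size=(n_trials, 8)).mean(axis=1)

    print(f"single-look std: {single_look.std():.3f}")  # ~0.50
    print(f"8-look std:      {eight_look.std():.3f}")   # ~0.50/sqrt(8) = 0.177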

However, the dominant assumption of a Gaussian distribution of the data, found throughout the SAR literature, must be questioned. On the one hand, it is claimed that we can perform a form of convolution (such as multilooking) on parts of the image with neighboring pixels to arrive at stronger local probabilities, because the data are spatially autocorrelated. But on the other hand, it is claimed that a Gaussian distribution applies because the multilooking data measurements are fully independent [2]. But how can the same data measurements be both fully independent and spatially autocorrelated?

B.2 Spatial Averaging in SAR

The nature of the methods used for spatial averaging in a SAR image must also be questioned. The methods in use essentially discard the phase data (using as justification the "fact" that the phases cancel each other out), and average only the signal amplitudes. However, coherent electromagnetic waves follow the laws of linear superposition, by which both the amplitude and phase are summed together. This is a very different principle than averaging.

We take as an example the two electromagnetic waves shown in Figure B.1. Let us assume for simplicity that one has an amplitude of A, and the other 3A. They have the same period and are in phase. We examine the differences for the interval 0 to 2π. Under a simple averaging scenario, the phase disappears and only the amplitudes are left (we use the maximum amplitude for simplicity). The average is then given by (A + 3A)/2 = 2A, which is then assigned back as the newly "smoothed" amplitude to both pixels in the image. But under the superposition scenario, A sin ωt + 3A sin ωt = 4A sin ωt, and the combined result would be as shown in Figure B.1 (right).


Figure B.1. Comparison of simple averaging (left) with linear superposition (right).

Either by looking at the combined waveform in each case, or by examining the area (i.e., integral) under the curve, it is clear that the results are quite different. Earlier it was observed that the amplitude, phase, and polarization of multiple scatterers in a resolution cell combine coherently by linear superposition into a single vector with a combined phase, amplitude, and polarization, which is then recorded at the antenna. When a form of "averaging" is sought, would it not therefore make a truer representation of the signal to combine the single vectors (one per pixel) using superposition into a larger "scattering" resolution cell? One can conjecture that the result may simply be larger-sized resolution cells, with an even greater variance (speckle) effect.
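The numerical difference between the two schemes is easy to verify; a sketch of the in-phase example above:

    import numpy as np

    A = 1.0
    t = np.linspace(0.0, 2.0 * np.pi, 1000)

    # Amplitude-only averaging, as in conventional spatial smoothing:
    avg_amplitude = (A + 3.0 * A) / 2.0                # 2A

    # Coherent linear superposition of the full in-phase signals:
    superposed = A * np.sin(t) + 3.0 * A * np.sin(t)   # 4A sin(t)

    print(f"averaged amplitude: {avg_amplitude}")             # 2.0
    print(f"superposed peak:    {superposed.max():.2f}")      # ~4.0
    # Mean power differs even more strongly: (2A)^2/2 = 2 vs (4A)^2/2 = 8.
    print(f"mean power: {np.mean((avg_amplitude * np.sin(t))**2):.2f} "
          f"vs {np.mean(superposed**2):.2f}")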

B.3 Backscatter

Although we chose an arbitrary, mid-range value for the backscatter σ and its uncertainty μ_σ in Section 5.1, in fact, owing to its apparently random nature, a wide range of values would be equally valid for either the backscatter or its uncertainty. Backscatter values can range from +15 to -50 dB and beyond. When this is translated from a logarithmic to a regular linear scale, the range is 7 or 8 orders of magnitude! For this reason, further discussion on the nature of backscatter is warranted. In the example used in Section 5.1, we considered backscatter simply as a source of uncertainty. That is, from the perspective of the measurement device(s), the values obtained for σ appear to be random and, therefore, quantities of uncertainty. Curlander summarizes the standard treatment of backscatter with the following statements (after Ulaby):

"Let us begin by introducing the notion that the specific radar cross section σ° for a terrain element is appropriately considered as a random variable... The value of σ° taken over a collection of nominally similar terrain elements will therefore not be constant, but will rather appear to be multiple realizations of a random quantity. The implication of this is that it is usually unfruitful to attempt to define a single deterministic backscatter coefficient for each terrain element and to replicate the terrain map of σ° in a SAR image... even if a terrain element dA contained one, or at most a few, dominant point scattering centers, so that a single deterministic value σ° might apply, aspect dependence may make the value σ° change in an apparently random fashion." [3]

But is σ° appropriately considered as a random variable? A random variable is defined as a numerical quantity associated with a chance-influenced experiment or phenomenon. It is random because the value it takes is not known with certainty before the experiment is performed [4]. Or, a random variable is a variable whose value is a numerical outcome of a random phenomenon [5]. A continuous random variable is associated with a probability density function (PDF), which is a nonnegative function g defined on the real line, such that the total area of the region bounded by the graph of g and the horizontal axis is equal to one [6]. If this is the case, then a PDF should be associated with SAR backscatter measurements. However, if you take an empirical distribution function of a random variable, then repeat the procedure for another sequence of measurements, most likely a different distribution function will result. Therefore, the measurements of a random variable as compared to the random variable itself are linked only by the range of the observations and "the image of the random variable" [7].

Yet, we have seen before that SAR/InSAR measurements are not totally random, but are spatially correlated in some way. In fact, Stein et al., in Spatial Statistics for Remote Sensing [8], observe the following, in contrast to single observations that are treated as independent quantities:

“From the viewpoint of spatial statistics it is instructive to consider the relations which exist between the pixels in a given image. The spatial variation in an image is determined by the interaction between the underlying variation and the sampling framework (which in remote sensing terms means the spatial resolution). Thus, even though in remote sensing we may not have control over the sampling framework, it is important to know about the relation between what is at the surface and how we sample it. Since the utility of any specific statistical technique may depend on the nature of the spatial variation in the remotely sensed data, the choice of spatial resolution [or, in the case of SAR/InSAR, the relation between wavelength and resolution cell] in relation to the underlying frequencies of spatial variation becomes fundamentally important.” [brackets added]

And later, relating classical statistics with spatial statistics:

“A problem with classical statistics (in the simplest sense) is that data are assumed to be independent. For pixels arranged on a regular grid it is unlikely that complete statistical independence will be achieved. In particular, it is likely that the values in neighboring pixels will be similar...This so-called spatial dependence, defined as the tendency for proximate observations to be more similar than more distant ones, invalidates the independence assumption of classical statistics.” [9, italics added]

The authors then point to the field of geostatistics for acquiring methods of spatial statistical analysis. Yet, the degree to which SAR/InSAR is random vs. the degree that it is correlated, and the relation of the correlation with underlying spatial pattern, wavelength, and resolution cell size has not been determined—and perhaps has not even been considered.
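One can at least probe such spatial dependence empirically. The sketch below does so for a synthetic field, so no claim is made about any particular SAR dataset; the neighbor-averaging step is merely a stand-in for whatever process induces correlation:

```python
import numpy as np

def lag1_autocorr(img):
    """Empirical lag-1 autocorrelation along rows: a crude probe of the
    spatial dependence discussed above."""
    a = img - img.mean()
    return (a[:, :-1] * a[:, 1:]).sum() / (a ** 2).sum()

rng = np.random.default_rng(0)
white = rng.normal(size=(128, 128))              # spatially independent pixels
smoothed = 0.5 * (white[:, :-1] + white[:, 1:])  # neighbor averaging induces
                                                 # correlation between pixels

print(f"independent field: {lag1_autocorr(white):+.3f}")    # near zero
print(f"correlated field:  {lag1_autocorr(smoothed):+.3f}") # near +0.5
```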

Next we explore another area that may offer some insights on the problem of data that are random in some aspects but not in others: stochastic geometry [10]. This is the study of random processes whose outcomes are geometrical objects or spatial patterns, which are random subsets of ℝᵈ. The main idea of stochastic geometry is to find connections between geometry and probability, that is, between the geometrical and probabilistic aspects of random spatial processes. The notion of independent random variables is scarce in stochastic geometry, in recognition that random spatial processes do not exhibit complete independence. We will not explore this further here, other than to say that stochastic geometry may offer insights for modeling the characteristics of SAR/InSAR data, as an improvement over the current “random variable” model.

From a different perspective, the backscatter might be considered as an influence quantity, defined as an item which has an effect on a stabilized output quantity (i.e., for which all other parameters can be fixed and known in advance) [11]. In engineering terms, influence quantities may correspond to variable factors such as source voltage, source frequency, load, temperature, or time [12]. Another definition of influence quantity is “...a quantity that is not the measurand but affects the result of the measurement.” [13]

Or, backscatter may be more appropriately considered as:

“...a systematic measurement error, produced by influence factors such as ambient temperature, humidity, voltage, external electrical noise, internal electrical noise, temperature gradients, mechanical vibration and so on. While some of these influences produce random errors, many also produce systematic errors, that is errors which remain constant while the factor producing them remains constant.

“Ideally, the various influence quantities would be investigated by holding all constant except for one and for this to be varied in a controlled manner. Once the relationship was characterised a correction could be applied. Unfortunately it is often difficult, if not impossible, to either measure or control the influence quantity with sufficient accuracy to be able to apply comprehensive corrections, thus our imperfect knowledge of these effects gives rise to an uncertainty in the measurement due to residual or uncompensated systematic errors. Even after such corrections were applied, there would be residual errors arising from the uncertainties of the correction determination.” [14]

It can therefore be observed that the question of how backscatter, as a measured quantity, should be treated in general statistical terms has scarcely been considered, and certainly not resolved with clarity! Its status as a “random variable” appears to be accepted as a nearly universal fact, and it is treated as such in analyses. Yet, if this treatment is fundamentally flawed, then the statistical foundations on which all subsequent measurements and image interpretations rest are ill-founded. In other words, if the statistical framework for a measured quantity (i.e., backscatter) has not been well established, how can that measured quantity itself be adequately and accurately manipulated within a statistical framework?

B.4 The Measurand Revisited

Until now, we have assumed that the measurand in SAR is the model by which a real feature or quantity is represented. So, for example, when we were talking about the returned power, that was the measurand. When we were talking about the backscatter, that was the measurand, and so on. However, by definition, this quantity must be able to be measured repeatably and reproducibly. Yet, as we saw, the interaction of the signal with the terrain yields a large percentage of unknown quantities that are neither repeatable, reproducible, nor uniquely represented. The underlying problem is discussed below.

[Figure B.2 panels: one pixel area on ground (tree, grass, building, lake) vs. the SAR representation of the same area (amplitudes, phases, and polarizations from all features in the resolution cell).]

Figure B.2. Comparison of world and SAR representation of the world.

The real problem with the way that SAR/InSAR data are represented (due to the nature of the signal) is that an unknown number of different features in a resolution cell are mapped to one simple waveform with an amplitude, phase, and polarization. This is depicted in Figure B.2 (left), in which a tree, lake, grassy area, and house are contained within a single resolution cell or pixel. The reflected SAR signal for the same resolution cell is depicted in Figure B.2 (right), in which all of these elements are summed into a single pixel response that contains the combined amplitudes, phases, and polarizations.

Yet there is virtually no way to separate the signal back out into the individual components that made up the sum! This is true whether the part of the scene being imaged is apparently homogeneous or not, due to wavelength-level interactions within the resolution cell and scatter from neighboring pixels.
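A toy example of this non-invertibility, with made-up scatterer values, shows that two different cell contents can produce exactly the same recorded vector:

```python
import numpy as np

# Two different resolution-cell contents with identical recorded returns:
cell_a = np.array([np.exp(1j * 0.0), np.exp(1j * np.pi / 2)])  # two unit scatterers
cell_b = np.array([np.sqrt(2) * np.exp(1j * np.pi / 4)])       # one scatterer

print(cell_a.sum())  # (1+1j)
print(cell_b.sum())  # (1+1j): indistinguishable at the antenna
```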

Another way to look at this is to see how ill-defined the problem is. Let us assume that our measurements are discretized into whole numbers only, according to the following scales:

1. Our amplitude A has 256 allowable values, to fit into an 8-bit byte.

2. The phase angle φ ranges from -180° to +180°, for a total of 360 values.

3. Polarization ellipse: the tilt angle ψ ranges from -90° to +90°, for a total of 180 values; and the ellipticity angle χ ranges from -45° to +45°, for a total of 90 values.

We represent this in the following way: in the unconstrained case, there are F(A, φ, ψ, χ) = F(256, 360, 180, 90) possible combinations for each pixel return. If we extend this out to the total number of combinations, we obtain:

256 × 360 × 180 × 90 = 1,492,992,000, or roughly 1.5 billion possible combinations.

This is only the range of responses for a single pixel! In other words, in exchange for four parameters and their respective measurements, we face some 1.5 billion possible combinations. Add to this the complexities of multiple wavelengths, area, volume, material geometry and composition, roughness, look angles, flying heights, resolutions, multilooking, etc., and it becomes clear why the inverse problem (i.e., inferring unique characteristics about a feature and determining its classification from these four parameters) is such a daunting task.
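The count itself, reproduced from the discretization above:

```python
# Discretized observable space per pixel, using the counts from the list above:
levels = {"A": 256, "phi": 360, "psi": 180, "chi": 90}

combinations = 1
for count in levels.values():
    combinations *= count

print(f"{combinations:,}")  # 1,492,992,000, i.e. ~1.5 billion states per pixel
```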

No matter how well the uncertainties are controlled and corrected for in the instrumentation; no matter how well the system is calibrated; no matter how clever, complex, and sophisticated the implementation of a classification routine—the results come down to these four basic observable quantities, from which all others are derived.

Since these four basic quantities are functions of an infinitely-varying world, what can be done?

First, severely restrict the domain. The types of classification studies that have the greatest chance of reasonable outcomes are those with the simplest requirements, for which the information sought aligns with distinctive characteristics of the data, and which have the least environmental variability. One of the most common domain restrictions in use today is to analyze only a single image, perhaps one that has been pre-selected for favorable features or characteristics. Another example of domain restriction is classifying the age of sea ice as first-year rough, first-year smooth, or multi-year ice, each of which has distinct characteristics that can be isolated in the SAR data [15]. (Even so, the reported accuracy varied from 50-90%, depending on the frequency, polarization, and classification method, for the single multi-frequency, multi-polarization image used.) Yet another example is the classification of two types of trees in a forested area dominated by those same two types, which have distinct characteristics in the SAR signal. Basically, in restricting the domain, the probability of attaining correct inverse mappings improves, as shown in Figure B.3.

Figure B.3. Restricting the domain (right) offers a greater probability to infer the scatterer properties and characteristics from the signal.

Second, add more observables. For example, some models require the user to input a number of variables about the vegetation height, density, trunk size, etc., while others require entry of soil moisture or the texture of the ground (as in the average size of soil aggregate in a given homogeneous region). The use of multiple sensors and datasets of SAR plus visible, infrared, multispectral, or laser data is another way to increase the probability (or conversely, to decrease the uncertainty) of correct classification. What one sensor is unable to discriminate, another often can [16]. However, sensor/data fusion efforts with SAR are scarcely reported in the literature, perhaps due to the large amount of effort and computing resources required, or because the work is classified.

Third, define ranges of dataset values and statistics that characterize certain landscape features. This is further described in the next section.

Fourth, use the spatial structure and distribution of the data, as well as expert knowledge of the terrain characteristics, to infer feature types.

But it is a combination of these that is likely to generate the most success in classifying SAR/InSAR data: severely restricting the domain; adding more observables; defining data ranges for feature classification; and using spatial structure. Even so, we are left with the issues of defining the measurand and the lack of absolute references against which feature data can be measured.

B.5 The Measurand and Preliminary Methodology

This section further develops the concepts discussed earlier of measurand, certified reference material and traceability, and uncertainty as they apply to the SAR/InSAR signal. We first revisit the basic concepts as presented in GUM. The measurand may also be described in the following way:

“This Guide is primarily concerned with the expression of uncertainty in the measurement of a well-defined quantity—the measurand—that can be characterized by an essentially unique value. If the phenomenon of interest can be represented only as a distribution of values or is dependent on one or more parameters, such as time, then the measurands required for its description are the set of quantities describing that distribution or that dependence...Because a measurement result and its uncertainty may be conceptual and based entirely on hypothetical data, the term “result of a measurement” should be interpreted in this broader context.” [17]

Three main points in this excerpt deserve further attention. First, a given measurand is well-defined and “essentially” unique; second, a measurand can be described in terms of a set of other measurands; and third, a measurand can also be described by a distribution of values if its “true” value is unknown.

We define Y as the measurand and μ as the uncertainty. Following the above guidelines, Y must be well-defined and unique, and may be represented by itself or, if the “true” value is unknown, by any of the following properties:

Y = Y₁ + Y₂ + Y₃ + ... + Yₙ

Y = Y₀ + μ₀

Y = (Y₁ + μ₁) + (Y₂ + μ₂) + (Y₃ + μ₃) + ... + (Yₙ + μₙ)
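As a minimal numerical sketch of how such component uncertainties might combine, assuming uncorrelated inputs and unit sensitivity coefficients (per the GUM law of propagation of uncertainty); the numbers below are placeholders, not values from this study:

```python
import math

# Illustrative component estimates (Y_i) and standard uncertainties (mu_i):
components = [(4.2, 0.3), (1.1, 0.1), (0.7, 0.2)]

Y = sum(y for y, _ in components)  # combined estimate

# Combined standard uncertainty for uncorrelated inputs with unit
# sensitivity coefficients (root-sum-of-squares, per GUM):
u_c = math.sqrt(sum(mu ** 2 for _, mu in components))

print(f"Y = {Y:.2f} +/- {u_c:.2f}")
```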

Further, NIST 1297 states the following about measurand reporting:

“When reporting the estimated value and uncertainty of a measurand, one should always make clear that the measurand is defined by a particular method of measurement and indicate what that method is. One should also give the measurand a name which indicates that it is defined by a measurement method, for example, by adding a modifier such as “conventional”... Execution of test methods according to standards should be expressed in terms of defined measures of repeatability and reproducibility.” [18]

The points of interest in this second quote are that: (1) standard methods should be developed to determine the constituency of a given material, and these methods should be fully described and named; and (2) when these methods are applied, they should be expressed in terms of repeatability and reproducibility.

Therefore, in keeping with these guidelines, the following points are proposed for describing the measurand:

1. A set of standard methods might be developed to specify how the measurand is to be defined and represented, for which the testing conditions should represent the widest possible range of environmental conditions. A statement of uncertainty should also be provided about the limitations of the testing conditions themselves.

2. A different set of measurands would apply for each sensor platform, according to parameters that are constant and unvarying for that platform. That is, a given set of measurands applies for a given wavelength(s), polarization(s), resolution, incidence angle, pulse repetition frequency, processing parameters, etc. Therefore, the measurand descriptions for each platform will be different. Where possible, quantitative mathematical models should be developed, using ANOVA or other statistical techniques, that will support model development for that particular sensor and set of conditions.

3. The measurand is to be described to the maximum extent possible, along with the limitations of the description and of the boundary conditions.

4. The measurand is defined as a real-world feature according to certain specific criteria (such as might characterize a forest); however, it is represented (the representational model) by a unique and specific combination of signal and spatial ranges. (The degree to which the classifier methods also need to be standardized remains to be seen, but even classification standardization is recommended at this early phase.)

For example, suppose that our dataset is SIR-C/X-SAR in dual polarization mode, as recently flown on the space shuttle's SRTM mission. We assume that the global dataset has been processed under uniform conditions, and that all indicated ground control, radiometric, geometric, bias, etc. corrections have been made to the data. A statement of uncertainty for the data should be included, according to known instrument and processing errors. The following steps are proposed, following our methodology:

1. Define the desired set of final features to the maximum extent possible. For example, if the category is forest: define the height, density, vegetation types, etc. that comprise a forest.

2. Identify a set of unique characteristics for each feature type in the SAR/InSAR signal to the maximum extent possible, using combined statistics from polarimetry, interferometry, magnitude, correlation, phase, etc. Thoroughly test for these characteristics using a wide range of landscapes and conditions, such that all types of features defined as “forest” are tested insofar as possible. Ambiguities and limitations of the procedure should be noted and included in a statement of uncertainty. The range of statistics found in the data, under the stipulated set of conditions, that reliably describes each feature type will become the defined measurand. Each feature type will have a unique statistical range of data values, to within a stated range of uncertainties. Thus, the measurand for “forest” will be characterized by the following statement:

Y_forest = (Y₁ + μ₁) + (Y₂ + μ₂) + (Y₃ + μ₃) + ... + (Yₙ + μₙ), where

(Yᵢ + μᵢ) corresponds to one particular set of processed SAR/InSAR data for which (a) the feature definition, (b) the dataset characteristics, (c) the dataset uncertainty, and (d) the known limitations are specified.

For example, suppose:

(Y₁ + μ₁) = a given range (or probability) of polarimetric characteristics of the dataset and its limitations, such as volume scattering (for forest) or some combination in areal extent of volume, double-bounce, or single-bounce statistics.

(Y₂ + μ₂) = a given range (or probability) of elevation and areal characteristics of the dataset, such as “if the region is taller than ‘X’ and its areal extent is greater than ‘X’ then region = forest.”

(Y₃ + μ₃) = a given range (or probability) of texture measures across the dataset, such as regional statistics on the magnitude data of small, rapid changes.

Frequency and polarization band ratioing, and many other statistical measures, are possible. The key is to arrive at a set of ranges or probabilities for the combined datasets (i.e., if conditions A + B + C are satisfied, then feature = forest) that are unique to a given feature type, to within a known uncertainty, to the maximum extent possible. In cases where the remaining ambiguity is too large and the level of uncertainty is unacceptable, additional data sources should be incorporated to bring the uncertainty level to within an acceptable range. It can be noted that the combined uncertainty μ₁ + μ₂ + ... + μₙ may be reducible in magnitude, because the addition of more datasets has the potential to reduce the uncertainty, although careful consideration should be given to the limiting conditions for each dataset.
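A minimal sketch of the proposed rule structure (conditions A + B + C combined); every field name and threshold value below is a hypothetical placeholder, not a result of this study:

```python
from dataclasses import dataclass

@dataclass
class RegionStats:
    # Hypothetical per-region statistics derived from processed SAR/InSAR data.
    volume_scatter_fraction: float  # polarimetric decomposition, 0..1
    height_m: float                 # InSAR-derived canopy height
    areal_extent_ha: float          # region size
    texture_variance: float         # local magnitude variance

def is_forest(s: RegionStats) -> bool:
    """Conditions A + B + C; each test stands in for a measurand range (Y_i + mu_i)."""
    cond_a = s.volume_scatter_fraction > 0.6               # (Y1 + mu1): polarimetry
    cond_b = s.height_m > 5.0 and s.areal_extent_ha > 1.0  # (Y2 + mu2): elevation/area
    cond_c = s.texture_variance > 0.2                      # (Y3 + mu3): texture
    return cond_a and cond_b and cond_c

print(is_forest(RegionStats(0.7, 12.0, 3.5, 0.3)))  # True under these assumptions
```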

If the measurands can be defined and described to within an acceptable range of uncertainty (with or without further data sources), the minimum sets of data (e.g., polarimetry, coherency, frequency band ratios) should be developed, first for a prototype study, with the classification methods fine-tuned into an efficient, largely automated workflow. Once this is done, the methodology can be applied to a larger area and, finally, to full production. However, all of this is predicated upon the ability to determine unique measurands and uncertainties (and models, when possible) for an established set of SAR/InSAR platform and processing configurations, as tested under a wide range of real landscape conditions, and this feasibility has yet to be established. Therefore, the first step must be to design and conduct experimental tests on a given type of dataset (such as the SRTM data) for which a given set of capture and processing conditions was earlier established and closely followed.

Next, we apply the concept of certified reference materials, traceable to a recognized standard, to our methodology. Certified reference materials (CRMs) are defined as:

“...materials that have been analysed by many different laboratories, so that statistically reliable values can be assigned to certain of their constituents. CRMs are used internationally as a control to verify the accuracy and precision of instrumentation or analytical methods. The use of reference materials is important in the drive towards quality accreditation, and also in checking that the analytical results from different laboratories compare well.” [19]

Because of the naturally varying and generally random distribution of natural and man-made landscape features, we cannot apply the concept of reference materials to the landscape itself, as might be desirable, beyond the statistical definitions and uncertainties expressed by the measurand. However, it is important to apply the concept of reference materials to the calibration of SAR/InSAR instruments, and for this purpose the use of one or more standard, common calibration ranges is suggested. Such a range would incorporate a wide variety and distribution of feature types; variable terrain; known moisture, precipitation, and vegetation conditions (perhaps regularly recorded); corner reflectors; highly detailed and accurate ground truth; DEMs; and feature maps for comparison purposes. Further research is needed to determine the optimal design, location(s), and maintenance requirements of such calibration ranges, along with the design and maintenance of an associated certification methodology. It is likely that the certification methodology (and perhaps certain characteristics of the calibration range) will need to be established differently for different sensor configurations.

B.6 Additional Paradigms to Consider

In Appendix A, SAR/InSAR thematic classification was described as being adapted from (at least) three paradigms: Remote Sensing, Machine Learning, and Map Generalization. Here we investigate two further paradigms that may shed light on how to classify many SAR/InSAR maps across a large area, with the goal of minimum human intervention. The two paradigms are remote sensing change detection and measurement standards for chemical analysis. Change detection is of interest because it deals with changes of the same ground area over time, while our interest is in changes of different ground areas across space but at the same time. In fact, there are many parallels, as we will see.

Alternate Paradigm 1: Remote Sensing Change Detection

The input for the change detection process typically comprises two classified maps of the same area, created from imagery captured at different times. In order to obtain a reliable map of changes, both the sensor and scene characteristics of the image at time t₁ must be similar to those at time t₂, and similar classification methods must be applied. We examine this topic because of its dependence on invariance for obtaining similar classification results. It opens the discussion on what factors promote scene invariance (e.g., the use of classifiers over a large number of images) that might be useful to the SAR image classification community.

In the discussion that follows, Jensen [20] lays out a strategy for successful remote sensing change detection. This strategy requires careful attention to both (1) the remote sensor systems and (2) environmental characteristics. Various parameters can influence the change detection process, leading to inaccurate results. Ideally, the remotely sensed data used to perform change detection should be acquired by a remote sensor system that is constant in temporal, spatial (and look angle), spectral, and radiometric characteristics. Each of these factors is described below.

The Remote Sensor Systems

Temporal resolution: Data should be obtained at approximately the same time of day to eliminate effects of the diurnal sun angle, and should be obtained on anniversary dates to remove (a) the seasonal sun angle and (b) plant phenological differences (explained later).

Spatial resolution and look angle: Accurate spatial registration of at least two images is necessary for digital change detection. Ideally, the remotely sensed data should be acquired by a sensor with the same instantaneous field of view (IFOV) on each date. (For example, TM data collected at 30 × 30 m spatial resolution on two dates are relatively easy to register to one another.) If two different sensor systems with different IFOVs are used, both datasets should be resampled to a uniform pixel size (e.g., 20 × 20 m). The two images should be acquired with approximately the same look angle.
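A minimal sketch of such resampling to a common pixel size, using nearest-neighbor selection on synthetic arrays; production work would use a geospatial library and proper registration:

```python
import numpy as np

def resample_nearest(img, src_res, dst_res):
    """Nearest-neighbor resampling of a regular grid to a new pixel size
    (illustrative only; no geolocation or registration is performed)."""
    ny, nx = img.shape
    out_ny = int(round(ny * src_res / dst_res))
    out_nx = int(round(nx * src_res / dst_res))
    rows = np.minimum((np.arange(out_ny) * dst_res / src_res).astype(int), ny - 1)
    cols = np.minimum((np.arange(out_nx) * dst_res / src_res).astype(int), nx - 1)
    return img[np.ix_(rows, cols)]

a = np.zeros((100, 100))   # e.g., 30 m data
b = np.zeros((150, 150))   # e.g., 20 m data
a20 = resample_nearest(a, 30.0, 20.0)  # both now on a 20 m grid
print(a20.shape)  # (150, 150), matching b
```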

Spectral resolution: The same sensor system should be used to acquire imagery on multiple dates, since different sensors sample varying spectral ranges. When this is not possible, bands should be selected that approximate one another.

Radiometric resolution: Some satellite remote sensors generate digital data in 8-bit brightness values ranging from 0 to 255. Ideally, the sensor systems should collect the data at the same radiometric precision on both dates.

Environmental Characteristics

It is desirable to hold environmental variables as constant as possible in order to produce reliable change detection results. The most important variables include:

Atmospheric conditions: There should be no clouds or extreme humidity on the days remote sensing data are collected. Even a thin layer of haze can change spectral signatures in satellite images. The use of anniversary dates helps to ensure seasonal similarity between the atmospheric conditions. For mountainous areas, topographic effects (of atmospheric layers) may also have to be removed.

Soil moisture conditions: Ideally, the soil moisture conditions should be identical on the two dates. Extremely wet or dry conditions on one of the dates can cause serious change detection problems. Precipitation records should be reviewed to determine how much rain or snow fell in the days and weeks prior to remote sensing data collection. When soil moisture differences between dates are significant for parts of the study area (perhaps due to a local thunderstorm), it may be necessary to stratify the affected areas and perform separate analyses.

Phenological Cycle Characteristics: Most natural and man-made ecosystems experience seasonal cycles. These cycles dictate when remotely sensed data should be collected to obtain the maximum amount of usable change information. Therefore, both the biophysical characteristics of the vegetation/soils/water ecosystems and the development cycles of man-made phenomena (such as urban development) should be well understood and similarities in both images should be sought as much as possible.

Vegetation growth follows diurnal, seasonal, and annual phenological cycles. Obtaining near-anniversary images greatly reduces the effects of seasonal phenological differences, which may otherwise cause change to be incorrectly detected in the imagery.

When attempting to identify change in agricultural crops, it is important to be aware of when the crops were planted. First, monoculture crops (e.g., corn and wheat) tend to be planted at approximately the same time of year; even so, a one-month lag in planting date between fields with the same crop in two images can cause serious change detection error. Second, the monoculture crops should be the same species, since different species of the same crop can cause it to reflect energy differently on the multiple dates of anniversary imagery. In addition, changes in row spacing and direction can have an impact. Therefore, the crop biophysical characteristics as well as the cultural land-tenure practices should be known in the study area, so that the most appropriate remotely sensed data can be selected for change detection.

Natural vegetation ecosystems such as wetland aquatic plants, forests, and rangeland have unique phenological cycles. For example, the phenological cycles of cattails and waterlilies dictate the most appropriate times for remote sensing data acquisition. The spatial distribution of cattails is best seen in remotely sensed data acquired in the early spring (April or early May), when the waterlilies have not yet developed. Conversely, waterlilies reach their full development in the summer, meaning that late summer or early fall is a better period to capture their distribution.

Man-made ecosystems also have phenological cycles. Novice image analysts tend to assume that change detection in the urban-rural fringe captures residential development as either rural undeveloped land or as completely developed residential land. However, Jensen reports at least ten stages of residential development, including clearing, subdivision, transportation, buildings, landscaping, and other characteristics, and all ten stages are captured in the imagery. Some stages may appear spectrally similar to other phenomena. Therefore, it is important to be aware of the phenological cycle of both the urban phenomena being investigated and the natural ecosystems.

For coastal change detection, tidal stage is another important factor in satellite image scene selection and the timing of aerial surveys. For most regions, images acquired at mean low tide are preferred.

What lessons can we learn from remote sensing change detection that can be applied to repeatable SAR classification? SAR change detection (or, alternatively, mapping across large areas) may best be performed when the data are captured by the same sensor with the same look angle, time of year, spectral bands, etc., and when the conditions of the landscape itself are as similar as possible, with similar soil moisture, season, phenology, etc. We know that SAR suffers from some of the same image classification problems as standard remote sensing practice; therefore, we would do well to give attention to the importance of imaging scenes that are as similar as possible if we want our classifiers to yield reliable results. Certainly the same conditions apply, and for the same reasons. Therefore, attention to the sensor, capture date and season, scene similarity, and other factors will aid our quest for reliable classifiers that perhaps can be used for a large number of scenes.

Alternate Paradigm 2: Measurement Standards for Chemical Analysis

When chemical materials are analyzed, it is often necessary to separate out the effects of other compounds and factors that may make accurate identification difficult. In chemical material analysis, the actual percentage of certain components is obtained. This contrasts with the current mode of classification, which relies on set theory: a ground pixel area is typically only permitted to be “in” or “out” of a given class. Is it possible to quantify landscapes by percent composition, and if so, how? The parallels between the two topics are strong, yet their respective ways (or paradigms) of dealing with them are quite different.

A chemical analysis is an attempt to determine, using instruments and physical testing methods, the composition of a sample. So is landscape mapping. While chemical analysis may ask the question, “is element A present in the sample?” (to which a set theory solution of “yes” or “no” is possible), a more typical question is, “what is the percent composition of various materials in this sample; how much of elements A, B, C, etc., are present, and to what degree of accuracy?” Perhaps it would make more sense to classify landscapes in this way, given their similarities, but at different scales. A chemical sample is usually small (a few grams or less), while an image classification sample may be several meters or even kilometers square.
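A sketch of what such percent-composition (“soft”) classification might look like, using randomly generated class likelihoods in place of real per-pixel scores; the class names are purely illustrative:

```python
import numpy as np

# Hypothetical soft classification: per-pixel class likelihoods instead of a
# hard in/out label, then percent composition over a sample region.
rng = np.random.default_rng(1)
scores = rng.random((64, 64, 3))                     # 3 classes, e.g. tree/grass/water
probs = scores / scores.sum(axis=-1, keepdims=True)  # normalize to proportions

composition = probs.mean(axis=(0, 1))                # region-level composition
for name, frac in zip(["tree", "grass", "water"], composition):
    print(f"{name}: {100 * frac:.1f}%")
```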

Chemical material samples can be of uniform density and composition (homogeneous, isotropic, isomorphic, etc.), of non-uniform density and composition, or partly uniform while other parts of the sample are not. This same situation is encountered in remote sensing classification. Both chemical sample analysis and landscape analysis rely on a determination of feature (elemental) density and mixture in the form of compounds. This principle is depicted in Figure B.4.

A chemical analysis relies on methods that may have been established over centuries to determine the composition, and are known to be reliable. A similar methodology to that used to determine the composition of chemical samples may be useful in remote sensing classification.

[Figure B.4 content: Landscape Analysis vs. Chemical Composition Analysis]

Case 1: find element. Landscape analysis: determine the elements that comprise a scene using a series of tests (e.g., locate a tree). Chemical analysis: determine whether a physical sample has a particular element in it (e.g., is sodium present in the sample?).

Case 2: find compound. Landscape analysis: determine a compound of elements (e.g., forest). Chemical analysis: determine whether a compound is present (e.g., sodium chloride).

Case 3: find percent. Landscape analysis: determine what percentage of trees is present in a given sample, by area (e.g., 50% deciduous, 50% pine). Chemical analysis: determine what percentage of elements is present in a given sample, by volume (e.g., 50% sodium, 50% chloride).

Figure B.4. Comparisons between map classification and determining the constituency of chemical compounds.

This is not to suggest that only two alternative paradigms exist in the whole realm of science, but rather to recognize that there is a large variety of possible paradigms from which a framework for theory and practice can be constructed. Theoretical and experimental research is needed to verify whether or how these concepts could apply to SAR/InSAR thematic mapping.

B.7 References

[1] McGrew, J. Chapman, Jr. and Monroe, Charles B., An Introduction to Statistical Problem Solving in Geography, Dubuque, Iowa: William C. Brown Publishers, 1993, pp. 114-116.

[2] Raney, Keith R., “Radar Fundamentals: Technical Perspective,” from Principles and Applications of Imaging Radar. Manual of Remote Sensing. Volume 2. Floyd M. Henderson and Anthony J. Lewis, Editors, New York: John Wiley & Sons, Inc., 1998, pp. 70-80.

[3] Ulaby, F.T., Moore, R.K., and Fung, A.K., Microwave Remote Sensing: Active and Passive, Volume II, Reading, Massachusetts: Addison-Wesley Publishing Company, 1982, p. 476.

[4] Curlander, John C. and McDonough, Robert N., Synthetic Aperture Radar Systems and Signal Processing, New York: John Wiley & Sons, Inc., 1991, p. 92.

[5] URL http://www.stat.ucla.edu/~rgould/154alwww/chatroom/0057.html, accessed 7/21/01.

[6] URL http://engineering.uow.edu.au/Courses/Stats/File32.html, accessed 7/22/01.

[7] URL http://www.math.ohiou.edu/~just/MATH250/convar2p.htm, accessed 7/22/01.

[8] URL http://random.mat.sbg.ac.at/~ste/dipl/node7.html, accessed 7/21/01.

[9] Atkinson, Peter M., “Spatial Statistics,” from Spatial Statistics for Remote Sensing, edited by Alfred Stein, Freek van der Meer, and Ben Gorte, Dordrecht: Kluwer Academic Publishers, 1999, pp. 60-61.

[10] Baddeley, Adrian, “A Crash Course in Stochastic Geometry,” University of Western Australia, URL http://maths.uwa.edu.au/stgeom, 36 pages.

[11] URL http://sl-proj-bi-specification.web.cern.ch/sl-proj-bi-specification/Activities/Glossary/glossary_v1.pdf, accessed 7/22/01.

[12] URL http://www.kepcopower.com/sl.htm#I, accessed 7/22/01.

[13] URL http://www.ptw.de/ptw_htm/service/download/specif.pdf, accessed 7/22/01.

[14] URL http://www.nata.asn.au/library/sounest.html, accessed 7/22/01.

[15] Skokr, M.E., Wilson, L.J., and Surdu-Miller, D.L., “Effect of Radar Parameters on Sea Ice Tonal and Textural Signatures using Multi-Frequency Polarimetric SAR Data,” Photogrammetric Engineering and Remote Sensing, Vol. 61, No. 12, December 1995, pp. 1463-1473.

[16] Klein, Lawrence A., Sensor and Data Fusion Concepts and Applications. Second Edition. Bellingham, Washington: SPIE—The International Society for Optical Engineering, Volume TT35, 1999, 226 pages.

[17] International Organization of Standardization, Guide to the Expression of Uncertainty in Measurement, First Edition, printed in Switzerland, 1995, pp. 1-10.

[18] Taylor, Barry N. and Kuyatt, Chris E., NIST Technical Note 1297, 1994 Edition, Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results, Gaithersburg, Maryland: National Institute of Standards and Technology, September 1994, Appendix D.4.

[19] URL http://www.mintek.ac.za/ASD/sarms.htm, accessed 7/22/01.

[20] Jensen, John R., Introductory Digital Image Processing: A Remote Sensing Perspective, Second Edition, Upper Saddle River, New Jersey: Prentice Hall, 1996, pp. 250-262.

BIBLIOGRAPHY

Anderson, James, Hardy, Ernest, Roach, John, and Witmer, Richard, A Land Use and Land Cover Classification System for Use with Remote Sensor Data. U.S. Government Printing Office, Washington D.C., 1976.

Aronoff, Stan, “Classification Accuracy: A User Approach,” Photogrammetric Engineering and Remote Sensing, Vol. 48, No. 8, August 1982, pp. 1299-1307.

Aronoff, Stan, “The Map Accuracy Report: A User’s View,” Photogrammetric Engineering and Remote Sensing, Vol. 48, No. 8, August 1982, pp. 1309-1312.

Aronoff, Stan, “The Minimum Accuracy Value as an Index of Classification Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 51, No. 1, January 1985, pp. 99-111.

ASPRS Specifications and Standards Committee, “ASPRS Accuracy Standards for Large Scale Maps,” 1990, as summarized in the Federal Geographic Data Committee's report #FGDC-STD-007.3-1998 on Geospatial Positional Accuracy Standards, Part 3: National Standard for Spatial Data Accuracy, Base Cartographic Data, found at URL http://fgdc.gov/standards/documents/standards/accuracy/chapter3.pdf.

Atkinson, Peter M., “Spatial Statistics,” from Spatial Statistics for Remote Sensing, edited by Alfred Stein, Freek van der Meer, and Ben Gorte, Dordrecht: Kluwer Academic Publishers, 1999.

Baddeley, Adrian, “A Crash Course in Stochastic Geometry,” University of Western Australia, URL http://maths.uwa.edu.au/stgeom, accessed June 2001.

Belward, A S., Estes, J.E., and Kline, K.D., “The IGBP-DIS Global 1-Km Land-Cover Data Set DISCover: A Project Overview,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1013-1020.

Booch, Grady, Object-Oriented Analysis and Design with Applications: Second Edition, Reading, Massachusetts: Addison-Wesley, 1994.

Briscoe, Garry and Caelli, Terry, A Compendium of Machine Learning, Volume 1: Symbolic Machine Learning, Norwood, New Jersey: Ablex Publishing Company, 1996.

Byrd Polar Research Center, URL http://www.bprc.mps.ohio-state.edu, “RADARSAT-1 Antarctic Mapping Project,” accessed 4-13-01.

Ceccarelli, Michele, and Petrosino, Alfredo, “Multi-feature adaptive classifiers for SAR image segmentation,” Neurocomputing 14 (1997) 345-363.

Chen, K. S., Huang, W. P., Tsay, D. H., and Amar, F., “Classification of Multifrequency Polarimetric SAR Imagery using a Dynamic Learning Neural Network,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 3, May 1996, pp. 814-820.

Cloude, S.R. and Pottier, E., “An Entropy Based Classification Scheme for Land Applications of Polarimetric SAR,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 68-76.

Coltelli, Mauro, et al., “SIR-C/X-SAR multifrequency multipass interferometry: a new tool for geological interpretation,” Journal of Geophysical Research, 10/25/96, Vol. 101, Iss. E10, pp. 23127-48.

Coulson, S.N., “SAR Interferometry with ERS, ” Earth Space Review, Vol. 5, No. 1, 1996, pp. 9-16.

Croarkin, C., Measurement Assurance Programs. Part II: Development and Implementation. NBS Special Publication 676-11, Washington, D C.: U.S. Government Printing Office, 1985.

Curlander, John C., Caranda, R., and Rosen, P., “Short Course on Interferometric Synthetic Aperture Radar: Theory and Applications,” held at The Ohio State University on November 20-22, 1996.

De Boissezon, H., Gonzales, G., Pous, B., and Sharman, M., “Rapid Estimates of Crop Acreage and Production at a European Scale Using High Resolution Imagery— Operational Review,” Proceedings of the International Symposium on F 1993, International Institute for Aerospace Survey and Earth Sciences, Enschede, 2:94-105.

Defense Advanced Research Projects Agency (DARPA), “Image Understanding for Battlefield Awareness,” Broad Agency Announcement (BAA) 96-14, issued September 12,1996, Section C.

DeFries, R.S. and Los, S.O., “Implications of Land-Cover Misclassification for Parameter Estimates in Global Land-Surface Models: An Example from the Simple Biosphere Model (SiB2),” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1083-1088.

Diamond, William J., Practical Experiment Designs for Engineers and Scientists. Belmont, CA: Lifetime Learning Publications, 1981.

Dong, Y., Milne, A.K., Forster, B.C., “Segmentation and Classification of Vegetated Areas Using Polarimetric SAR Image Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 2, February 2001, pp. 321-329.

El-Rayes, M.A. and Ulaby, F.T., Microwave Dielectric Behavior of Vegetation Material, Report No. 022132-4-T, Radiation Laboratory of the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, January 1987 (Contract NAG 5-480 from NASA/Goddard Space Flight Center, Greenbelt, Maryland).

Estes, J., Belward, A., Loveland, T., et al., “The Way Forward,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1089- 1093.

Farr, Tom et al, “Mission in the Works Promises Precise Global Topographic Data.” EOS Transactions, American Geophysical Union, Vol. 76, No. 22, May 3, 1995.

Federal Geographic Data Committee, “Status of FGDC Standards as of June 15, 2001,” URL http://www.fgdc.gov/standards/textstatus.html.

Federal Geographic Data Committee, Vegetation sub-committee FGDC-STD-005: Vegetation Classification Standard. June 1997.

Federal Geographic Data Committee, "Content Standard for Digital Geospatial Metadata (CSDGM),” http://www.fgdc.gov/metadata/contstan.html.

Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata — FGDC-STD-001-1998, URL http://www.fgdc.gov/standards/status/csdgm_rs_ex.html.

Federal Geographic Data Committee, Standard Workings Group, Content Standard for Digital Geospatial Metadata: Extensions fo r Remote Sensing Metadata (Public Review Draft), December 21,2000.

Feynman, Richard, Cargo Cult Science, adapted from a Caltech commencement address given in 1974, from the book Surely You're Joking, Mr. Feynman!, URL http://pc65.frontier.osrhe.edu/hs/science/feynman.htm.

Filho, Otto, Treitz, Paul, et al., “Texture Processing of Synthetic Aperture Radar using Second-Order Spatial Statistics,” Computers and Geosciences, Vol. 22, No. 1, pp. 27-34, 1996.

Fitzpatrick-Lins, Katherine, “Comparison of Sampling Procedures and Data Analysis for a Land-Use and Land-Cover Map.” Photogrammetric Engineering and Remote Sensing, Vol. 47, No. 3, March 1981, pp. 343-351.

Foody, Giles M.; McCulloch, Mary B.; and Yates, William B., “Classification of Remotely Sensed Data by an Artificial Neural Network: Issues Related to Training Data Characteristics,” Photogrammetric Engineering and Remote Sensing, Vol. 61, No. 4, pp. 391-401, April 1995.

Franceschetti, G. and Iodice, A., “The Effect of Surface Scattering on IFSAR Baseline Decorrelation,” Journal of Electromagnetic Waves and Applications, Vol. 11, 1997, pp. 353-370.

Freeman, A., Chapman, B., and Alves, M. MAPVEG Software User’s Guide, JPL Document D-11254, Jet Propulsion Laboratory, October 1993.

Freeman, A. and Durden, S.L., “A Three-Component Scattering Model for Polarimetric SAR Data,” a report of the Jet Propulsion Laboratory, California Institute of Technology.

Fulghum, David A., “DARPA Looks Anew at Hidden Targets,” Aviation Week & Space Technology, January 6, 1997, pp. 56-57.

Gamba, Paolo, and Houshmand, Bigan, “Digital Surface Models and Building Extraction: A Comparison of IFSAR and LIDAR Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 4, July, 2000, pp. 1959-1968.

Gamba, P., Houshmand, B., and Saccani, M., “Detection and Extraction of Buildings from Interferometric SAR Data,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 1, January 2000, pp. 611-775.

Gray, A.L. and Farris-Manning, P., “Repeat-Pass Interferometry with Airborne Synthetic Aperture Radar,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 1, January 1993, pp. 180-191.

Genderen, J. L. van, “Testing Land-Use Map Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 43, No. 9, September 1977, pp. 1135-1137

Genderen, J. L. van, and Lock, B. F., “A Methodology for Producing Small Scale Rural Land Use Maps in Semi-Arid Developing Countries Using Orbital M.S.S. Imagery,” Final Contractor's Report, NASA-CR-151173, September 1976.

Ginevan, Michael E., “Testing Land-Use Map Accuracy: Another Look,” Photogrammetric Engineering and Remote Sensing, Vol. 45, No. 10, October 1979, pp. 1371-1377.

Golden, Borup, Cheney, et. al, “Inverse Electromagnetic Scattering Models for Sea Ice,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 5, September 1998, pp. 1675-1704.

Green, David M., and Swets, John A., Signal Detection Theory and Psychophysics, New York: John Wiley & Sons, Inc., 1966.

Guide Quantifying Uncertainty in Analytical Measurement. Second Edition. Eurachem, St. Gallen: EMPA, c. 2000/2001, URL http://www.measurementuncertainty.org/mu/guide/index.html.

Guptill, Stephen C. and Morrison, Joel L., Editors, Elements of Spatial Data Quality. Oxford, U.K.: Elsevier Science Ltd., published on behalf of the International Cartographic Association, 1995.

Hagberg, J.O. and Ulander, L.M., “On the Optimization of Interferometric SAR for Topographic Mapping,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 1, January 1993, pp. 303-306.

Hales, Stephen, His Vegetable Staticks (1727), from http://www.english.upenn.edu/~jlynch/Frank/People/hales.html.

Hara, Y., Atkins, R. G., Yueh, S. H., Shin, R. T ., and Kong, J. A., “Application of Neural Networks to Radar Image Classification,” IEEE Transactions on Geoscience and Remote Sensing. Vol. 32, No. 1, January 1994, pp.100-109.

Henderson, Floyd and Lewis, Anthony, Editors, Principles and Applications of Imaging Radar, Manual of Remote Sensing, Third Edition, Volume 2, New York: John Wiley & Sons, Inc., 1998.

Henderson, Floyd M., and Xia, Zong-Guo, “Understanding the Relationships between Radar Response Patterns and the Bio- and Geophysical Parameters of Urban Areas,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 93-101.

Henderson, Floyd, “SAR Applications in Human Settlement Detection, Population Estimation and Urban Land Use Pattern Analysis: A Status Report,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 79-85.

Hirosawa, Haruto, “Degree of Polarization of Radar Backscatters from a Mixed Target,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 2, March 1997, pp. 466-470.

Hopcraft, K.I. and Smith, P R., An Introduction to Electromagnetic Inverse Scattering. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1992.

Hord, R. M., and Brooner, W., “Land-Use Map Accuracy Criteria,” Photogrammetric Engineering and Remote Sensing, Vol. 42, No. 5, May 1976, pp. 671-677.

Husak, G.J., Hadley, B.C., and McGwire, K.C., “Landsat Thematic Mapper Registration Accuracy and its Effects on the IGBP Validation,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1033-1040.

International Organization of Standardization, Guide to the Expression of Uncertainty in Measurement, a collaborative international effort by the International Bureau of Weights and Measures (BIPM), the International Electrotechnical Commission (IEC), the International Federation of Clinical Chemistry (IFCC), the International Organization of Standardization (ISO), the International Union of Pure and Applied Chemistry (IUPAC), the International Union of Pure and Applied Physics (IUPAP), and the International Organization of Legal Metrology (OIML), 1995.

Irving, William W., and Novak, Leslie M. “A Multiresolution Approach to Discrimination in SAR Imagery,” IEEE Transactions on Aerospace and Electronic Systems, Vol. 33, No. 4, October 1997, pp. 1157-1168.

ISO/TC 211, Draft International Standard ISO/DIS 19113: Geographic Information — Quality Principles, February 22, 2001, URL http://www.statkart.no/isotc211.

ISO/TC 211, Draft review summary from stage 0 of project 19124: Geographic Information — Imagery and Gridded Data Components, December 1, 2000, URL http://www.statkart.no/isotc211 (restricted access).

Ito, Yosuke and Omatu, Sigeru, “Land Cover Mapping Method for Polarimetric SAR Data,” SPIE Proceedings 3070, 1997, pp. 388-397.

Jensen, John R. Introductory Digital Image Processing: A Remote Sensing Perspective. Second Edition. Upper Saddle River, New Jersey: Prentice Hall, 1996, pp. 247-262.

Joughin, Ian R., Winebrenner, Dale P., and Percival, Donald B., “Probability Density Functions for Multilook Polarimetric Signatures,”IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 3, May 1994, pp. 562-574.

JPL Publication 96-16, Operational Use of Civil Space-Based Synthetic Aperture Radar, prepared by the interagency ad hoc working group on SAR, August 21, 1996.

Kelly, M., Estes, J.E., and Knight, K.A., “Image Interpretation Keys for Validation of Global Land-Cover Data Sets,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1041-1050.

Kimmins, J.P., Forest Ecology: A Foundation for Sustainable Management. Second Edition. Upper Saddle River, New Jersey: Prentice-Hall, 1997.

Klein, Lawrence A., Sensor and Data Fusion Concepts and Applications, Second Edition, Bellingham, Washington: SPIE—The International Society for Optical Engineering, Volume TT35, 1999.

Kononenko, Igor, and Bratko, Ivan, “Information-Based Evaluation Criterion for Classifier's Performance,” Machine Learning, 6(1):67-80, 1991.

Kuhn, Thomas S., The Structure of Scientific Revolutions. Third Edition. Chicago and London: The University of Chicago Press, 1996.

Lee, H. and Liu, J. G., “Analysis of Topographic Decorrelation in SAR Interferometry using Ratio Coherence Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 2, February 2001, pp. 223-231.

Lee, J.S., Hoppel, K.W., et al., “Intensity and Phase Statistics of Multilook Polarimetric and Interferometric SAR Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 5, September 1994, pp. 1017-1028.

Lim, H.H.; Swartz, A.A.; Yueh, H.A.; Kong, J.A.; and Shin, R.T., “Classification of Earth Terrain using Polarimetric Synthetic Aperture Radar Images,” Journal o f Geophysical Research, Vol. 94, No. B6, pp. 7049-7057, June 10,1989.

Liu, Guoqing, Huang, Shunji, et al., “Bayesian Classification of Multilook Polarimetric SAR Images with Speckle Model,” SPIE Vol. 3070, 1997, pp. 398-405.

Lorrain, P., Corson, D., and Lorrain, F., Electromagnetic Fields and Waves: Third Edition, New York: W.H. Freeman and Company, 1988.

Loveland, T.R., Zhu, Z., Ohlen, D.O., et al., “An Analysis of the IGBP Global Land-Cover Characterization Process,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1021-1032.

Loveland, T.R., Brown, J.F., Ohlen, D O., and Zhu, Z., “The Global Land-Cover Characteristics Database: The Users’ Perspective,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1069-1074.

Loveland, T. R., Estes, J. E., and Scepan, J., “Introduction,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1011-1012.

Luckman, Adrian J., “Correction of SAR Imagery for Variation in Pixel Scattering Area Caused by Topography,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998, pp. 344-350.

Mallows, Colin C., Design, Data, and Analysis. New York, New York: John Wiley & Sons, 1987.

Mapping Science Committee, Board on Earth Sciences and Resources, Commission on Geosciences, Environment, and Resources, National Research Council, A Data Foundation for the National Spatial Data Infrastructure. Washington, D C.: National Academy Press, 1995.

Marr, David, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W.H. Freeman and Company, 1982.

McGrew, J. Chapman, Jr., and Monroe, Charles B., An Introduction to Statistical Problem Solving in Geography. Dubuque, Iowa: William C. Brown Publishers, 1993.

McMaster, Robert B., and Shea K Stuart, Generalization in Digital Cartography. Washington, D. C. Association of American Cartographers, 1992.

Michalski, R. and Stepp, R., “Learning from Observation: Conceptual Clustering,” in Machine Learning: An Artificial Intelligence Approach, (Ed.) R. Michalski, J. Carbonell, and T. Mitchell, Palo Alto, CA: Tioga, 1983.

Mitchell, Tom M. Machine Learning. New York: McGraw-Hill Companies, Inc., 1997.

Mitchell, T., Moran, M S., Vidal, A., Troufleau, D.,and Inoue, Y., “Ku- and C-Band SAR for Discriminating Agricultural Crop and Soil Conditions,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998, pp. 265-272.

Morain, Stanley A., PE&RS — Photogrammetric Engineering and Remote Sensing, Special Issue: Global Land Cover Data Set Validation, Volume 65, Number 9, September 1999, pp. 1011-1093.

Morrison, Joel L. and Guptill, Stephen C., Editors, Elements of Spatial Data Quality, Oxford, U.K.: Elsevier Science Ltd., published on behalf of the International Cartographic Association, 1995.

Mrstik, V., VanBlaricum, G., Cardillo, G., and Fennell, M., ‘Terrain Height Measurement Accuracy of Interferometric Synthetic Aperture Radars,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 1, January 1996, pp. 219-228.

Muchoney, D., Strahler, A., Hodges, J., and LoCastro, J., “The IGBP DISCover Confidence Sites and the System for Terrestrial Ecosystem Parameterization: Tools for Validating Global Land-Cover Data,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1061-1068.

Nair, Dinesh, Mitiche, Amar, and Aggarwal, J. K., “On Comparing the Performance of Object Recognition Systems,” Proceedings of the Second International IEEE Conference on Image Processing, October 23-26, 1995, Washington, D.C., Los Alamitos, CA: IEEE Computer Society Press, pp. 631-634.

Pairman, D., Belliss, S.E., and McNeill, S.J., “Terrain Influences on SAR Backscatter around Mt. Taranaki, New Zealand,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 4, July 1997, pp. 924-932.

Polidori, L., Caillault, S., and Canaud, J.-L., “Change Detection in Radar Images: Methods and Operational Constraints,” Proceedings IGARSS '95. Firenze, Italy, July 1995, pp. 1529-1531.

Popper, Karl R., Conjectures and Refutations: the Growth of Scientific Knowledge. Second Edition. New York: Basic Books, Inc., 1965.

Raghu, P. P. and Yegnanarayana, B. “Multispectral Image Classification using Gabor Filters and Stochastic Relaxation Neural Network,” Neural Networks, Vol. 10, No. 3, 1997, pp. 361-472.

Rich, Elaine, and Knight, Kevin. Artificial Intelligence. Second Edition. New York: McGraw-Hill, Inc., 1991, 621 pages.

Richards, John A. Remote Sensing Digital Image Analysis: An Introduction. Berlin: Springer-Verlag, 1986, New York, New York: John Wiley & Sons, 1984.

Rignot, E., and Chellappa, R., “Segmentation of synthetic-aperture-radar complex data,” Journal of the Optical Society of America A, Volume 8, No. 9, September 1991, pp. 1499-1509.

Rignot, Eric, and Chellappa, Rama. “Segmentation of Polarimetric Synthetic Aperture Radar Data,” IEEE Transactions on Image Processing, Vol. 1, No. 3, July 1992, pp.281-300.

Robinson, Arthur H., Sale, Randall D., Morrison, Joel L., and Muehrcke, Phillip C., Elements of Cartography, Fifth Edition. New York: John Wiley and Sons, 1984.

Rodriguez, E., Imel, D., and Madsen, S.N., “The Accuracy of Airborne Interferometric SARs,” IEEE Transactions on Aerospace and Electronic Systems, 1995.

Rodriguez, E. and Martin, J.M., “Theory and Design of Interferometric Synthetic Aperture Radars,” IEE Proceedings-F, Vol. 139, Number 2, April 1992, pp. 147-159.

Rosenfield, George H., and Fitzpatrick-Lins, Katherine, “A Coefficient of Agreement as a Measure of Thematic Classification Accuracy,” Photogrammetric Engineering and Remote Sensing, Vol. 52, No. 2, February 1986, pp. 223-227.

RSMAS Technical Report TR95-003, SAR Interferometry and Surface Change Detection. Report of a workshop held in Boulder, Colorado, February 3-4, 1994. Published July 1995 by the University of Miami Rosenstiel School of Marine and Atmospheric Science, pp. 3-8.

Rudd, R. D., “Macro Land-Use Mapping with Simulated Space Photographs,” Photogrammetric Engineering, Vol. 37, 1971, pp. 365-372.

Santrock, John W., Adult Development and Aging. Dubuque, Iowa: William C. Brown Publishers, 1985.

Scepan, J., Menz, G., and Hansen, M.C., “The DISCover Validation Image Interpretation Process,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1075-1082.

Scepan, Joseph, “Thematic Validation of High-Resolution Global Land-Cover Data Sets,” Photogrammetric Engineering and Remote Sensing, Volume 65, Number 9, September 1999, pp. 1051-1060.

Schanda, Erwin, Physical Fundamentals of Remote Sensing. Berlin, Germany: Springer-Verlag, 1986.

Schott, John R., Remote Sensing: The Image Chain Approach. New York: Oxford University Press, 1997.

Schowengerdt, Robert A., Remote Sensing Models and Methods for Image Processing: Second Edition. San Diego: Academic Press, 1997.

Sears, F.W., Zemansky, M.W., and Young, Hugh, University Physics. Sixth Edition. Reading, MA: Addison-Wesley Publishing Company, 1982.

Sheen, Dan R., “Statistical and Spatial Properties of Forest Clutter Measured with Polarimetric Synthetic Aperture Radar,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 3, May 1992, pp. 578-588.

Shokr, M.E., Wilson, L.J., and Surdu-Miller, D.L., “Effect of Radar Parameters on Sea Ice Tonal and Textural Signatures using Multi-Frequency Polarimetric SAR Data,” Photogrammetric Engineering and Remote Sensing, Vol. 61, No. 12, December 1995, pp. 1463-1473.

Smits, Paul C., and Dellepiane, Silvana G., “Synthetic Aperture Radar Image Segmentation by a Detail Preserving Markov Random Field Approach,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 4, July 1997, pp. 844-857.

Solberg-Schistad, A.H. and Jain, A.K., “Texture Fusion and Feature Selection Applied to SAR Imagery,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 2, March 1997, pp. 475-479.

Sperry, Roger, and Donahue, Arnold, “Geographic Information for the 21st Century,” GEOWorld, October 1999, pp. 34-35.

Standards Working Group, Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata: Extensions for Remote Sensing Metadata (Public Review Draft), December 21, 2000, pp. v-vi and lines 371-386, available via URL http://www.fgdc.gov/standards/status/csdgm rs ex.html, 2 pages.

Stevenson, Paula J., Literature Review on SAR for Mapping and Feature Extraction (Part I), an internal report indexing over 500 relevant books, articles and papers on SAR/InSAR, Center for Mapping at The Ohio State University, Columbus, Ohio, August 1998 (updated June 2001).

Stewart, C.V., B. Moghaddam, K. J. Hintz, and L. M. Novak, “Fractional Brownian Motion Models for Synthetic Aperture Radar Imagery Scene Segmentation,” Proceedings of the IEEE, Vol. 81, No. 10, October 1993, pp. 1511-1522.

Stobbs, A. R., “Some Problems of Measuring Land Use in Underdeveloped Countries: the Land Use Survey of Malawi,” Cartographic Journal, Vol. 5, 1968, pp. 107-110.

Strozzi, T., Dammert, P., Wegmuller, U., et al., “Landuse Mapping with ERS SAR Interferometry,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 2, March 2000, pp. 766-775.

Taylor, Barry N. and Kuyatt, Chris E., NIST Technical Note 1297: Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results. Physics Laboratory of the National Institute of Standards and Technology, Gaithersburg, Maryland, September 1994.

Tomlinson, R.F., Geographical Data Handling, Volume 1: Environment Information Systems. Symposium Edition, UNESCO/IGU Second Symposium on Geographic Information Systems, Ottawa, August 1972.

Toomay, J.C., Radar Principles for the Non-Specialist: Second Edition. Mendham, New Jersey: SciTech Publishing, Inc., 1998.

Treuhaft, Robert N. and Siqueira, Paul R., “Vertical Structure of Vegetated Land Surfaces from Interferometric and Polarimetric Radar,” Radio Science, Volume 35, Number 1, January-February 2000, pp. 141-177.

Tuell, Grady H., “The Use of High Resolution Airborne SAR for Shoreline Mapping,” from Object Recognition and Scene Classification from Multispectral and Multisensor Pixels, addendum to the Proceedings of the ISPRS Commission III Symposium, held July 6-10, 1998, in Columbus, Ohio.

Tzeng, Y.C. and Chen, K.S., “A Fuzzy Neural Network to SAR Image Classification,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 36, No. 1, January 1998.

Ulaby, F.T. and Elachi, C., Editors, Radar Polarimetry for Geoscience Applications. Norwood, MA: Artech House, Inc., 1990.

Ulaby, F.T., Moore, R.K., and Fung, A.K., Microwave Remote Sensing: Active and Passive, Volume II: Radar Remote Sensing and Surface Scattering and Emission Theory. Reading, MA: Addison-Wesley Publishing Company, 1982.

U.S. Geological Survey, “What is SDTS?,” URL http://www.mcmcweb.er.usgs.gov/sdts/whatsdts.html.

U.S. Geological Survey, “SDTS: Spatial Data Transfer Standard - Part 1,” http://mcmcweb.er.usgs.gov/sdts/SDTS standardnov 97/partlbll.html.

URL http://www.fgdc.gov/nsdi/nsdi.html.

URL http://madsci.wustl.edu/posts/archives/apr99/9251574I8.Sh.r.html.

URL http://www.alaska.edu/user serv/amplitude.html, example from ASF ERS-1 SAR image #8154400, accessed 7/12/01.

URL http://www.asf.alaska.edu/calval/old cr.gif, accessed 7/12/01.

URL http://www.alaska.edu/daac documents/cdrom images/60769200.gif, accessed 7/12/01.

URL http://engineering.uow.edu.au/Courses/Stats/File32.html, accessed 7/22/01.

URL http://www.ptw.de/ptw htm/service/download/speciF.pdf, accessed 7/22/01.

URL http://www.nata.asn.au/library/sounest.html, accessed 7/22/01.

URL http://www.mintek.ac.za/ASD/sarms.html, accessed 7/22/01.

URL http://www.fgdc.gov/standards/status/textstatus.html, accessed 7/31/01.

URL http://www.fgdc.gov/standards/documents/standards/metadata/v2 0698, accessed 6/30/01.

URL http://fgdc.er.usgs.gov/framework/overview.html, accessed 7/26/01.

URL http://esa.sdsc.edu/initv60.html, accessed 7/22/01.

URL http://www-bprc.mps.ohio-state.edu (Byrd Polar Research Center), “Radarsat-1 Antarctic Mapping Project,” accessed 4/13/01.

URL http://random.mat.sbg.ac.at/~ste/dipl/node7.html, accessed 7/21/01.

URL http://sl-proj-bi-specification.web.cern.ch/sl-proj-bi-specification/Activities/Glossary/glossary v1.pdf, accessed 7/22/01.

URL http://www.kepcower.com/gl.htm#l, accessed 7/22/01.

van Zyl, J.J., Burnette, C.F., and Farr, T.G., “Inference of Surface Power Spectra from Inversion of Multifrequency Polarimetric Radar Data,” Geophysical Research Letters, Vol. 18, No. 9, September 1991, pp. 1787-1790.

van Zyl, J.J., Chapman, B.D., Dubois, P., and Shi, J., “The Effect of Topography on SAR Calibration,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 5, September 1993, pp. 1036-1043.

Vegetation Classification Panel of the Ecological Society of America, Review Draft V. 6.0, July 2000, “An Initiative for a Standardized Classification of Vegetation in the United States,” accessed 7/30/01.

Vegetation Subcommittee, Federal Geographic Data Committee, FGDC-STD-005: Vegetation Classification Standard, June 1997.

Vexcel Corporation, URL http://www.vexcel.com/proj/ifsar/html, accessed 10/22/98.

Waite, W.P., “Historical Development of Imaging Radar,” Geoscience Applications of Imaging Radar Systems, RSEMS (Remote Sensing of the Electro Magnetic Spectrum, A.J. Lewis, Ed.), Association of American Geographers, 3(3): 1-22, 1976.

Wegmuller, Urs, and Werner, Charles. “Retrieval of Vegetation Parameters with SAR Interferometry,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 18-24.

Wegmuller, Urs, and Werner, Charles L. “SAR Interferometric Signatures of Forest,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 33, No. 5, September 1995, pp. 1153-1161.

Wessel, Paul, “Uncertainty in Derived Quantities,” URL http://www.soest.hawaii.edu/wessel/courses/gg313/DA book/node23.htm, created 12/7/2000, accessed 7/1/2001.

Williams, C.L., Rignot, E., McDonald, K., Viereck, L.A., Way, J.B., and Zimmermann, R., “Monitoring, classification, and characterization of interior Alaska forests using AIRSAR and ERS-1 SAR,” Polar Record 31(111), 1994, pp. 227-234.

Winston, Patrick Henry. Artificial Intelligence. Third Edition. Reading, MA: Addison- Wesley Publishing Company, 1993.

Wong, Yiu-fai and Posner, Edward C. “A New Clustering Algorithm Applicable to Multispectral and Polarimetric SAR Images,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 31, No. 3, May 1993, pp. 634-644.

Xia, Zong-Guo and Henderson, Floyd M. “Understanding the Relationships between Radar Response Patterns and the Bio- and Geophysical Parameters of Urban Areas,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 35, No. 1, January 1997, pp. 79-85.

Zebker, H.A., van Zyl, J., and Elachi, C., “Imaging Radar Polarization Signatures: Theory and Observation,” Radio Science, Volume 22, Number 4, July-August 1987, pp. 529-543.

Zebker, H.A. and Villasenor, J., “Decorrelation in Interferometric Radar Echoes,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No. 5, September 1992, pp. 950-959.

Zebker, H.A., Werner, C., Rosen, P.A., and Hensley, S., “Accuracy of Topographic Maps Derived from ERS-1 Interferometric Radar,” IEEE Transactions on Geoscience and Remote Sensing, Vol. 32, No. 4, 1994, pp. 823-836.

Zebker, H.A., Rosen, P.A., and Hensley, S., “Atmospheric Effects in Interferometric Synthetic Aperture Radar Surface Deformation and Topographic Maps,” Journal of Geophysical Research, Volume 102, Issue B4, April 10, 1997, pp. 7547-7563.

Zonneveld, I. S., “Aerial Photography, Remote Sensing and Ecology,” ITC Journal, Part 4, 1974, pp. 553-560.
