Image Understanding Using Artificial Intelligence Technology

Dave Kargel, Geographer

Will Brandt, Engineer

PRC, Inc., 111 West Port Plaza, Suite 327, St. Louis, MO 63146

Abstract

An artificial neural network approach was evaluated in multispectral image processing applications, including general land cover classification and land use feature identification. The performance of an artificial neural network was compared to that of a standard statistical classification technique available from a commercial image processing package (ERDAS). Landsat Thematic Mapper 30 meter multispectral data of St. Louis, Missouri were used for the classification. Seventeen distinct land cover types were identified in the St. Louis study area. Supervised classifications were performed with both the neural network and the standard statistical methods. The results show that the neural network is more responsive to cultural material with skewed spectral distributions and sub-pixel spatial resolution (e.g., asphalt roads). The neural network's probabilistic output also enables discrepancy resolution for the higher level task of land use feature identification. A model was developed to identify roads from the multispectral classifications. The neural network based model performed much better than the model based on the standard statistical classification technique.

1. Introduction

The PRC Image Understanding Using Artificial Intelligence Technology Independent Research and Development project, initiated in 1991, is presently evaluating the utility of neural network and expert system technologies in management, classification, and feature extraction functions related to multispectral images. The long term goal is to build a prototype capable of handling numerous types of images and performing the various functions and analyses now performed manually.

As users of geographic data better understand the value of remote imagery, the demand for automated image processing and feature extraction capabilities increases. Many government agencies and private firms have invested in research and development efforts addressing image understanding applications of particular concern to them. Many of the applications currently use techniques that combine computer manipulation and human interpretation of the imagery.

Currently, the application of spectral and spatial enhancement functions, the derivation of spectral signature ranges, and coordinate transformations for georeferencing are all performed by computer algorithms. However, the choice of algorithms still depends on direct interaction with a professional image interpreter who is knowledgeable about the user application and the relevance of each algorithm to it.

Artificial intelligence can be used to encode the knowledge base and decision rules implemented by professional image interpreters. Many of the repetitive decisions made in image interpretation may be delegated to artificial intelligence based automation. The added automation will increase the speed, precision, accuracy, and adaptability of the image interpretations. The resulting data will be cheaper in terms of time, money, and necessary expertise.

This report contains three sections. The following section discusses the approach we have taken in automating the image understanding task. This includes a specific discussion of critical areas in the image understanding task and how they may be automated. The next section reports the most recent research results. The results are divided based on the critical areas addressed. The final section contains our conclusions.

2. Automation Methodology

2.1 Image Inventory

An image inventory capability is an essential part of an automated image understanding system. A properly indexed image inventory system allows the correct images to be chosen from inventory for each different application query received. The images must be inventoried by the query criteria used to select images for each application. This set of criteria includes spectral resolution, spatial resolution, temporal issues, and the georeferencing system of the inventoried imagery.

We are convinced that the task of selecting the imagery required for an application can be most effectively automated by implementing an expert system based approach. The expert system is given a rule base and knowledge set incorporating the decision process of the professional image interpreter. This selection process will allow the automated image understanding system to be nearly as flexible as a human interpreter. Figure 1 shows a simplistic expert system architecture.

2.2 Image Georeferencing

The inclusion of an automated image georeferencing process will allow proper image registration when an application calls for a specific coordinate system. The georeferencing process will be applied as an automated point matching algorithm from a newly acquired image to a previously georeferenced image stored in inventory [TON89; PERL89]. An expert system can then apply the proper transformation process and select any coordinate system required for an application. The georeferencing information for an image can be stored automatically in the image inventory. This will facilitate both the image georeferencing and the image selection processes.
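The transformation step that follows point matching can be illustrated with a small sketch. The text does not say which transformation model is used, so the affine model below, the function names, and the control point coordinates are all assumptions made for illustration. Given matched points (pixel column and row in the new image, easting and northing in the georeferenced image), it fits the six affine coefficients by least squares.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= factor * m[col][k]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        known = sum(m[r][k] * x[k] for k in range(r + 1, 3))
        x[r] = (m[r][3] - known) / m[r][r]
    return x


def fit_affine(pixel_pts, map_pts):
    """Least-squares affine fit from (col, row) to (easting, northing).

    Assumed model (not from the paper): easting = a*col + b*row + c,
    northing = d*col + e*row + f. Needs at least three non-collinear points.
    """
    xtx = [[0.0] * 3 for _ in range(3)]
    xty_e = [0.0] * 3
    xty_n = [0.0] * 3
    for (col, row), (east, north) in zip(pixel_pts, map_pts):
        v = (col, row, 1.0)
        for i in range(3):
            for j in range(3):
                xtx[i][j] += v[i] * v[j]
            xty_e[i] += v[i] * east
            xty_n[i] += v[i] * north
    return solve3(xtx, xty_e), solve3(xtx, xty_n)


def to_map(coeffs, col, row):
    """Apply the fitted affine coefficients to one pixel position."""
    (a, b, c), (d, e, f) = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Made-up control points on a 30 meter grid; the coordinates are illustrative only.
pixel_pts = [(0, 0), (100, 0), (0, 100), (100, 100)]
map_pts = [(700000.0, 4280000.0), (703000.0, 4280000.0),
           (700000.0, 4277000.0), (703000.0, 4277000.0)]
coeffs = fit_affine(pixel_pts, map_pts)
easting, northing = to_map(coeffs, 50, 50)
```

In a real system the matched points would come from the automated point matching step, and the fitted coefficients could be stored with the image in the inventory.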

[Figure 1 diagram: a hypothesis, TM_Imagery, evaluated from four conditions (spectral_resolution is 'visible - IR'; spatial_resolution is '30 Meters'; georeferencing is 'UTM Zone 15'; temporal is 'September 9, 1989'). When the hypothesis holds, the action 'Display Hit Image' is executed.]

Figure 1. Expert system architecture
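The kind of rule shown in Figure 1 can be sketched as a query over an indexed inventory. This is a minimal illustration, not the expert system shell used in the project: the ImageRecord fields, the query keys, and the inventory entries are hypothetical names chosen to mirror the four criteria in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """One inventory entry, indexed by the selection criteria of Section 2.1."""
    name: str
    spectral_resolution: str
    spatial_resolution_m: float
    georeferencing: str
    acquired: str  # ISO date string, so plain string comparison orders dates


def select_images(inventory, query):
    """Return the names of images that satisfy every rule in the query."""
    rules = [
        lambda img: img.spectral_resolution == query["spectral_resolution"],
        lambda img: img.spatial_resolution_m <= query["max_spatial_resolution_m"],
        lambda img: img.georeferencing == query["georeferencing"],
        lambda img: query["earliest"] <= img.acquired <= query["latest"],
    ]
    return [img.name for img in inventory if all(rule(img) for rule in rules)]


# Hypothetical inventory; only the first entry mirrors the Figure 1 hypothesis.
inventory = [
    ImageRecord("tm_stl_1989", "visible-IR", 30.0, "UTM Zone 15", "1989-09-09"),
    ImageRecord("pan_stl_1990", "panchromatic", 1.0, "UTM Zone 15", "1990-04-10"),
    ImageRecord("tm_other_1989", "visible-IR", 30.0, "UTM Zone 16", "1989-09-09"),
]
query = {
    "spectral_resolution": "visible-IR",
    "max_spatial_resolution_m": 30.0,
    "georeferencing": "UTM Zone 15",
    "earliest": "1989-09-01",
    "latest": "1989-09-30",
}
hits = select_images(inventory, query)
```

Keeping each criterion as a separate rule makes it simple to add or relax a condition for a different application query.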

2.3 Spectral Image Processing

PRC will automate the spectral classification process using an integrated neural network and expert system based solution. While an expert system is an artificial intelligence architecture which represents expert knowledge as an explicit rule set and knowledge base, a neural network architecture represents expert knowledge as a generalization of examples presented to it. This is beneficial when the development of explicit rules would be too complex or constricting. PRC will integrate these two artificial intelligence architectures into a single spectral classification solution.

The use of neural network technology will improve the classification and the interpretation of spectral signatures. Figure 2 shows a very simplistic neural network architecture. A neural network architecture consists of a large network of simple processing elements or nodes, represented by circles, which process information in response to examples. The input layer can accept pixel values from different spectral bands of raw satellite imagery. The hidden layer stores features it learns from the input layer. Finally, the output layer contains nodes representing land use classes into which the raw pixel values will be classified. The knowledge in a neural network is distributed throughout the network in the form of internode connections and weighted links which form the inputs to the nodes. The link weights serve to enhance or inhibit the input stimuli values, which are then added together at the nodes. If the sum of all the inputs to a node exceeds some threshold value, the node executes and produces an output which is passed on to other nodes or is used to produce some output response [NEUR91; CARP92; JONE87].

[Figure 2 diagram: input nodes for the spectral bands, a layer of hidden units, and a layer of output units.]

Figure 2. Neural network architecture

Neural network classification techniques provide several useful advantages over standard classification techniques. Unlike the standard classification techniques, which attempt to fit each class to a standard distribution curve, a neural network is able to handle classes with skewed distributions. These features give neural networks a fault tolerant behavior which is generally referred to as "graceful degradation". Graceful degradation allows such systems to operate successfully in noisy environments [BISC92], such as a mountainous area with associated areas of bright reflectance and shadows. A neural network is also able to provide classifications with associated confidence levels for all possible material classes (e.g., water 55%, asphalt 45%) [NEUR91]. This can be useful information for automating the problem of discrepancy resolution in pixel classification. For example, if an area is identified as asphalt but includes a water pixel in the middle, it may be analyzed for discrepancies. If the classification probabilities for the water pixel are equal to the ones shown above, the 45% asphalt classification will be accepted as the true classification.

An expert system will be used to optimize the neural network's classification of the pixels. By referring to the information stored in the image inventory, the expert system can allow the neural network to compensate its signatures for different imaging conditions. For example, an image taken at noon on a sunny summer day will have brighter streets than an image taken at dawn on a rainy winter day. This will allow for complete automation of the spectral classification process by incorporating into a system the flexibility associated with the human interpreter.
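The discrepancy resolution example in this section (a water pixel at 55% inside an asphalt area, with asphalt at 45%) can be sketched as follows. The decision rule here is an assumption made for illustration: a pixel is relabeled only when all of its neighbors agree on another class and the pixel's own probability for that class reaches a hypothetical min_prob cut-off. The text does not state the actual rule or threshold.

```python
from collections import Counter


def top_class(probs):
    """Class with the highest confidence for one pixel."""
    return max(probs, key=probs.get)


def resolve_discrepancies(prob_grid, min_prob=0.40):
    """Relabel isolated pixels using their second-choice confidences.

    prob_grid is a 2-D list of dicts mapping class name to probability.
    Both the unanimous-neighbor test and min_prob are illustrative assumptions.
    """
    rows, cols = len(prob_grid), len(prob_grid[0])
    labels = [[top_class(p) for p in row] for row in prob_grid]
    resolved = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                labels[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if not neighbors:
                continue
            majority, count = Counter(neighbors).most_common(1)[0]
            unanimous = count == len(neighbors)
            if unanimous and majority != labels[r][c]:
                if prob_grid[r][c].get(majority, 0.0) >= min_prob:
                    resolved[r][c] = majority
    return resolved


# The example from the text: one water pixel (55%) surrounded by asphalt.
asphalt = {"asphalt": 0.90, "water": 0.10}
doubtful = {"water": 0.55, "asphalt": 0.45}
grid = [[asphalt, asphalt, asphalt],
        [asphalt, doubtful, asphalt],
        [asphalt, asphalt, asphalt]]
result = resolve_discrepancies(grid)
```

A pixel classified as water with high confidence (for example 95%) would keep its label under this rule, since its asphalt probability falls below the cut-off.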

Such a system will also be equipped to learn signatures for materials it was not trained for. The system operator will follow the steps for standard supervised classifications by identifying representative pixels in the image from each target class. Then, using the information stored in the image inventory and the expert system, the system will build an image independent spectral signature for each target class, compensating for the atmospheric conditions of the imagery. These signatures can then be included in the fully automated spectral classification portion of an image understanding system.

2.4 Feature Extraction

The automation of the feature identification process involves the integration of the spectral classification information and low-level spatial feature detection. The low-level spatial features, such as lines, edges, and regions, will be detected by corresponding spatial image processing algorithms. Many such general low-level spatial image processing algorithms have been developed and proven successful [THOM87; FUA87; TON89]. The particular features to be detected will be decided by the expert system based on the present application query.

The expert system will then initiate a high-level feature identification process. This process, illustrated in Figure 3, will integrate the material classification information for individual pixels and the low-level feature groups they correspond to [HUER87]. For example, an asphalt pixel corresponding to a linear feature is identified as a road. This can be handled by a technique called raster modelling, which combines different data sources and derives new classifications for pixels based on a structured spatial processing language [TOML90].

To identify the more complex high-level features, such as airports, it will be useful for the expert system to convert the simpler features to vector format. In the vector format, data is stored as discrete objects located at discrete locations in coordinate space, unlike raster data, which generalizes objects into a predetermined grid format. In the vector format, the expert system is able to build features based on rules of topology and areal composition, such as shape and texture [HARW87; HARV87]. These types of rules can distinguish between an orchard and a woodland, with an orchard having a uniform pattern of trees and grassy areas and a woodland having a more random pattern.

[Figure 3 diagram: the neural network and the image processor both feed the expert system, which produces the high-level feature identification.]

Figure 3. Expert system integration of spectral and spatial experts

3. Image Understanding Research Activities

3.1 Spectral Image Processing

Our research in the automation of spectral image processing examined the use of a neural network for the classification of water, pavement, and vegetation from a September 9, 1989 Landsat Thematic Mapper (TM) 30 meter multispectral image of St. Louis, Missouri, USA. The spectral classification was automated with a uniquely configured back-propagation neural network that was developed using NeuralWorks Professional II/Plus by NeuralWare, Inc., a commercial neural network builder.
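To make the network structure concrete, the sketch below runs a forward pass through a network with the dimensions reported for this study (six spectral inputs, nine hidden units, 17 land cover outputs). It is only an illustration under stated assumptions: the weights are random and untrained, the sigmoid is used as the smooth threshold function commonly paired with back-propagation, and dividing the outputs by their sum to obtain confidence levels is our assumption, not a documented NeuralWorks behavior. The pixel values are invented.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 6, 9, 17


def sigmoid(x):
    """Smooth threshold function applied to the weighted input sum."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """Random link weights; each row holds n_in weights followed by a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def layer_output(weights, inputs):
    """Weighted sum of the inputs plus bias at each node, then the threshold function."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def classify(pixel, hidden_w, output_w):
    """Return one confidence value per output class for a six-band pixel."""
    scaled = [value / 255.0 for value in pixel]  # 8-bit band values to 0..1
    hidden = layer_output(hidden_w, scaled)
    raw = layer_output(output_w, hidden)
    total = sum(raw)
    return [value / total for value in raw]  # assumed normalization


rng = random.Random(0)  # fixed seed so the untrained weights are repeatable
hidden_w = make_layer(N_INPUTS, N_HIDDEN, rng)
output_w = make_layer(N_HIDDEN, N_OUTPUTS, rng)
confidences = classify([60, 45, 50, 120, 90, 40], hidden_w, output_w)
best_class_index = confidences.index(max(confidences))
```

With trained weights, the index of the largest confidence would identify the land cover class, and the remaining values would supply the secondary confidences used for discrepancy resolution.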

Input image values      Line filter          Output image values (first band)

10   17  255   50    0      -1   0   2   0  -1      0   0  255   0   0
20   50  255   20    0      -1   0   2   0  -1      0   0  255   0   0
15   20  255   20    0      -1   0   2   0  -1      0   0  255   0   0
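The filter application shown in these figure values can be sketched in a few lines. This is a minimal illustration rather than the ERDAS batch files or C code used in the project. The vertical kernel and input values are as best recovered from the figure; the horizontal kernel, the zero padding at the image edges, and the threshold of 128 are assumptions. The threshold in particular is not stated in the text, but some cut-off of this kind is needed to reproduce the zeros shown in the output.

```python
# Vertical bright-line kernel as recovered from the figure (3 rows x 5 columns).
VERTICAL = [[-1, 0, 2, 0, -1]] * 3
# Hypothetical horizontal counterpart: the transpose of the vertical kernel.
HORIZONTAL = [list(col) for col in zip(*VERTICAL)]


def raw_response(image, kernel):
    """Correlate image with kernel, treating pixels outside the image as 0."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_off, c_off = k_rows // 2, k_cols // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - r_off, c + j - c_off
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            out[r][c] = total
    return out


def line_bands(image, filters, threshold=128):
    """Return (strength, direction) bands from a dict of named kernels.

    Band 1 holds the highest filter response per pixel, clipped to 255.
    Band 2 holds the name of the filter that produced it.
    """
    responses = {name: raw_response(image, k) for name, k in filters.items()}
    rows, cols = len(image), len(image[0])
    strength = [[0] * cols for _ in range(rows)]
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            name = max(responses, key=lambda n: responses[n][r][c])
            value = min(255, responses[name][r][c])  # clip to the 8-bit range
            if value >= threshold:  # assumed cut-off, not from the paper
                strength[r][c] = value
                direction[r][c] = name
    return strength, direction


FIGURE_INPUT = [[10, 17, 255, 50, 0],
                [20, 50, 255, 20, 0],
                [15, 20, 255, 20, 0]]
strength, direction = line_bands(FIGURE_INPUT, {"vertical": VERTICAL})
```

With the assumed threshold, the strength band for this input is 255 in the center column and 0 elsewhere, matching the output values above. Passing all eight directional kernels in the filters dict would give the two band output described in Section 3.2.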

Figure 4. Spatial enhancement: line detector (vertical, bright)

The neural network contains six spectral class inputs (blue, green, red, near-infrared, mid-infrared band 5, and mid-infrared band 7), nine hidden units, and 17 land cover class outputs (main channel, shallow channel, deep lake, shallow pond, lake edge, runway, building, trees, grass, sand, pavement [1-2], clay, vegetation [1-4]).

The neural network was trained with a file containing 1000 pixel values from each land cover class. The resulting spectral classification was visually verified against an April 10, 1990, 1 inch to 400 foot panchromatic aerial photograph of the study area. The aerial photograph provides a good representation of the land cover within the St. Louis image. The neural network identifies all the target land cover classes. It also identifies more of the mixed pixel pavement surfaces than the ERDAS supervised classification technique.

3.2 Low-level Feature Extraction

The PRC research and development efforts focusing on automated low-level feature extraction have concentrated on line detection algorithms. The basic line detection algorithms applied were derived from work presented by Dr. Jezching Ton [TON89]. The Ton algorithms were chosen because of their direct application to road detection, reportedly high success rate, and straightforward development.

While Ton compiled his computer algorithms in Pascal, our approach was to develop batch files in ERDAS, a commercial image processing package. The line detection algorithms were tested on the TM image described in the previous section. The basic Ton algorithms involve the application of a special set of spatial enhancement techniques called convolution filters. Figure 4 shows this application. The filters are designed to detect bright lines in Landsat TM data in eight different directions. The line filter in Figure 4 is designed to detect vertical lines. A two band image is derived as output from the filter application. The first band is composed of the highest value returned for each pixel from all of the applied filters. The second band is the direction associated with the filter returning the highest value for each pixel. The two band output image can then be interpreted as line segments with associated intensities and directions. In Figure 4, the bright pixels in the center of the input image are highlighted from the background pixels in the output image.

The line detection algorithms, written by Regional Research and Development Service (RRDS) at Southern Illinois University at Edwardsville (SIUE), were composed of ERDAS batch files and C code. The algorithms were then applied to Band 3 of the Landsat TM image of St. Louis. This band is highly responsive to cultural features. The results of the line detection algorithm on the St. Louis data are encouraging. A visual inspection based on the aerial photograph mentioned earlier shows that virtually all of the major lines and most of the minor lines are detected. It is believed that the low resolution of Landsat TM data is responsible for the difficulty in detecting some lines representing roads narrower than 30 meters. We feel confident that the line detection algorithms we have developed provide accuracy levels high enough to be used in an automated feature extraction system. The true test will be the accuracy levels achieved by the high-level feature identification algorithms that use the line detection results as input.

3.3 High-level Feature Extraction

The first high-level feature extraction research has focused on road identification algorithms. These algorithms attempt to integrate the results of the spectral land cover classifications and the low-level line detection algorithms. The road identification algorithm is a raster model developed to replicate a simple expert system based decision algorithm.

The road extraction algorithm was applied to the St. Louis TM data. As mentioned earlier, a neural network classification includes weights from zero to one representing the probability of the pixel corresponding to each class. The two classes with the highest probabilities for each pixel were chosen as input to the road detection algorithm. These two levels of neural network material classification were then recoded to pavement and non-pavement materials. This information was then compared with the low-level line feature outputs in a simple rule based model. The output of the model depicts major and minor roads within St. Louis.

The output was compared with truth data to obtain quantitative accuracy levels. The truth data was digitized from the panchromatic aerial photograph of the study area. The aerial photograph was manually digitized and rectified to the UTM coordinate system of the TM image. The final digitized truth data were then registered and overlaid onto the results of the high-level road extraction algorithm. This overlay provided a pixel based accuracy metric for the road extraction results. This showed accuracy levels of 96% for major roads within the control area and 91% for secondary roads. The omission of most road pixels appears to be due to the poor resolution of the Landsat TM data and to the time of year the St. Louis image was taken. The TM image was taken in early September, when the trees were still fully canopied. The truth data photograph was taken early the following April, when the canopies were minimal. The aerial photograph reveals that many of the roads omitted by the road detection algorithm had trees over them and would have been obscured by foliage in the September Landsat imagery.

4. Conclusion

Our research has proven the basic approach taken in image understanding. The neural network land cover classification algorithm has been effectively demonstrated in general land cover classification and specifically in classification of material associated with cultural features. The image processing based low-level feature extraction technique has been proven effective and efficient in extracting linear features. The expert system integrated high-level feature extraction algorithm proves to be more effective than traditional computerized classification techniques and as effective as manual image interpretation. PRC intends to continue with research and development efforts aimed at producing a prototype image understanding system based on the approach defined in this report.

Bibliography

BISC92  Bischof, H., Schneider, W., and Pinz, A. "Multispectral Classification of Landsat-Images Using Neural Networks." IEEE Transactions on Geoscience and Remote Sensing, May 1992, pp 482-490.

CARP92  Carpenter, Gail and Grossberg, Stephen. "A Self-Organized Neural Network For Supervised Learning, Recognition, and Prediction." IEEE Communications, September 1992, pp 38-49.

FUA87  Fua, Pascal and Hanson, Andrew. "Using Generic Geometric Models for Intelligent Shape Extraction." Proceedings: DARPA Image Understanding Workshop, February 1987, pp 227-241.

HARV87  Harvey, Wilson, Diamond, Matthew, and McKeown, David Jr. "Tools for Acquiring Spatial and Functional Knowledge in Aerial Image Analysis." Proceedings: DARPA Image Understanding Workshop, February 1987, pp 857-873.

HARW87  Harwood, David. "Interpretation of Aerial Photographs by Segmentation and Search." Proceedings: DARPA Image Understanding Workshop, February 1987, pp 507-520.

HUER87  Huertas, A., Cole, W., and Nevatia, R. "Detecting Runways in Aerial Images." Proceedings: DARPA Image Understanding Workshop, February 1987, pp 273-339.

JONE87  Jones, William and Hoskins, Josiah. "Back-Propagation: A Generalized Data Learning Rule." Byte, October 1987, pp 155-162.

NEUR91  "NeuralWorks Professional II/Plus and NeuralWorks Explorer Neural Computing." Pittsburgh: NeuralWare, Inc., 1991.

PERL89  Perlant, Frederic and McKeown, David. "Scene Registration in Aerial Image Analysis." Proceedings: DARPA Image Understanding Workshop, May 1989, pp 309-331.

THOM87  Thompson, D. and Mundy, J. "Model-Directed Object Recognition on the Connection Machine." Proceedings: DARPA Image Understanding Workshop, February 1987, pp 98-106.

TOML90  Tomlin, C. Dana. Geographic Information Systems and Cartographic Modeling. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1990.

TON89  Ton, Jezching. "A Knowledge Based Approach to Landsat Image Interpretation." Doctoral Dissertation, Michigan State University: Department of Computer Science, 1989.