·Rucopr· EXPERIENCE with the USE of SUPERCOMPUTERS to PROCESS LANDSAT DATA
Total Page:16
File Type:pdf, Size:1020Kb
·rUCopr· EXPERIENCE WITH THE USE OF SUPERCOMPUTERS TO PROCESS LANDSAT DATA MARTIN OZGA , Research Section Remote Sensing Branch Statistical Research Division Statistical Reporting Service United States Department of Agriculture Washington, D.C. 20250 March 1984 ***presented at the 1984 International Symposium on Machine Processing of Remotely Sensed Data Purdue University, West Lafayette, Indiana, June 1984. EXPERIENCE WITH THE USE OF SUPERCOMPUTERS TO PROCESS LANDSAT DATA MARTIN OZGA STATISTICAL REPORTING SERVICE, U.S. DEPARTMENT OF AGRICULTURE ABSTRACT to be very useful and cost-effective. For the purposes of this paper, a supercomputer is a Single Instruction Multiple Data (SIMD) machine The Statistical Reporting Service of the characterized by very fast processing in a vector or United States Department of Agriculture has had parallel mode in which several items of data are to process large amounts of Landsat data in its being operated on simultaneously. The items of data project to improve the precision of crop acreage in Landsat processing are, of course, the pixels. Due estimates by using Landsat data. Supercomputers to the large amount of data handled, high I/O have been found to be very useful and cost- transfer rates are necessary for Landsat applications. effective tools in handling this processing load. Present supercomputers achieve parallel processing either by pipelined operation or by use of I. INTRODUCTION multiple processors. The two series of supercomputers currently commercially available, the CRA Y series from CRA Y Research, Inc., and the The Statistical Reporting Service (SRS) of CYBER 200 series from Control Data Corporation the United States Department of Agriculture (CDC), are pipelined machines. Examples of multi- (USDA) uses Landsat data to improve the precision processor machines include the ILLIAC-IV (now out of SRS crop acreage estimates. This use of of existence) and the experimental Massively Parallel Landsat data supplements ground data obtained by Processor. SRS enumerators who visit farms in selected areas known as segments. In the SRS Landsat Typically, supercomputers have a 64-bit word procedures, the ground data are used to label size to facilitate scientific computation. Some have Landsat data for supervised classifier training via a half-word or 32-bit mode which runs faster for within-class-type clustering and are also used in programs not requiring such high precision. Many the calculation of the Landsat-based estimates of Landsat processing programs are among that group. crop acreage (Ozga, 1976). Where sufficient ground and Landsat data are available, all Landsat A supercomputer will often have a front end scenes for entire states are completely classified-- processor. The front end processor, typically a some scenes more than once in attempts to mainframe or a super-mini, is used to gather data optimize the estimate. The most desirable Landsat from various inputs (such as magnetic tape) and scenes are those acquired late in the growing transfer it to the high speed disk connected to the season. With delays in receiving these scenes and supercomputer. Output data is collected from the with a mid-December deadline to generate supercomputer disk and either displayed for the user estimates, a large number of scenes must be or saved for further processing. The high cost of processed in a relatively short time. In 1983, crop supercomputers often makes it too costly for the estimates were calculated for seven states-- supercomputer to do such data handling directly. Arkansas, Illinois, Iowa, Oklahoma, Kansas, Missouri, and Colorado (eastern part only)--for a USDA-SRS used the ILLlAC-IV until it was total of 60 scenes. Each of these scenes was discontinued and has since switched to its classified at least once and many more than once. replacement, a CRA Y X-MP. In addition, SRS has Moreover, a large amount of clustering was investigated the CYBER 200 series and plans to required. In addition, a land cover study of investigate the Massively Parallel Processor. Missouri was performed using II scenes of which 9 were multi temporal. For this heavy, concentrated processing and data transfer load, SRS has found supercomputers • ~": •.•.. "'" :'4'~' .• II. THE ILL.lAC-IV operations are performed from eight 64~vf8rd vector registers. The main language used is FOR TRAN-77. The current SRS remote sensing project has There are no syntactic extensions for vector evolved, in part, from a project at the University operations. The compiler analyzes DO-loops to of Illinois to investigate the use of the ILLIAC-[V determine if they may be vectorized. In the case of for Landsat data processing. The ILLIAC-IV was nested loops, an attempt will be made to vectorize developed by the University of Illinois under the inner loop only so loops must be coded carefully contract with the Department of Defense and was and in the proper, order to be vectorized (CRA Y CFT, actually constructed by Burroughs Corporation 1981) (Petersen, 1983). Special vector procedures are (Hord, 1982). The ILLIAC-IV consisted .of 64 available for some operations which may not be processors each of which could access directly directly expressed in FORTRAN. Assembly language 2048 words in a shared memory. ]t had the usual is available for writing functions which cannot be 64-bit word but could operate in 32-bit mode efficiently expressed in FORTRAN. giving nearl~ the effect of 128 processors •. This configuration was excellent for. a maXlmllln The CRA Y at NASA-Ames currently has likelihood classification algOrithm since the pixels 2,000,000 64-bit words of main memory. Although are treated independently. However, for efficient not all of this is available fot user programs (some operation, the mean and covariance matrices for being reserved for the operating system), memory all categories had to be stored in the memory of size has not been a problem in any of our programs. each processor thus limiting the number of There are also 1,000,000,000 64-bit words of high- categories although the limit chosen (64) ~as speed disk storage. Most of this is available for generally sufficient. The ILLIAC-IV had a high temporary use by user programs although a small speeed disk, adequate for handling one full frame portion is taken up by permanent storage of user and of MSS data. system files. A MSS scene takes up about 5,000,000 words of this disk storage. The CRA Y has several The [LLIAC-IV was installed at NASA-Ames front end processors, a Control Data CYBER 170 and and became quasi-operational in the mid-to late- one or more VAX 11/780s. SRS uses a DEC-IO at 1970's. [t was plagued by hardware problems due Bolt Beranek and Newman (BBN) for most of its to components which were then obsolete. Since processing. This DEC-IO is connected to the only one ILLIAC-IV was ever built, maintenan~e ARPANET, a Department of Defense computer costs were high. The front end processor, a DEC- network, as is one of the VAXes, at NASA-Ames. lO (TENEX) was inadequately configurated for Smaller flies are transferred directly to the VAX disk Landsat processing since it provided neither the to be transferred to the CRA Y when called for in large storage nor the high speed transfer ~ates jobs. Larger files, that is full scenes, are mailed on necessary for the large amounts of data reqUired. tapes. These are stored at NASA-Ames and, when The front-end DEC-IO was also heavily used for called for in CRA Y jobs, are read by the CYBER with other processing. This situation improved when the contents sent to the CRA Y disk. direct transfer from tape to ILLIAC disk was implemented. However, neither the disk nor the Currently, SRS uses the CRA Y for maximum main memory was big enough for more than one Job likelihood classification, the CLASSY clustering at a time so that users were charged full algorithm, aggregation of classified files, and block processing rates for time spent in data transfer. [n correlation to create multitemporal scenes. The spite of all these problems, SRS was able to do a classification program is available in two versions: a significant amount of processin~. ~n the lLLlAC-IV four-channel version which is heavily optimized and a and it provided better faCIlities than other two to sixteen channel version which is somewhat machines available at the time. Typically, 5 to 10 less optimized but still quite efficient. The minutes were required to classify a scene, aggregation program is not a vectorized program but depending on the number of categories. This rather is kept on the CRA Y since it takes the large included data transfers to and from ILLIAC disk classified file along with other inputs and creates a but not data transfer from tape. small tabular file. The CLASSY clustering program has so far not been effectively vectorized, but since The lLLIAC-IV was discontinued in the CRA Y is the only really powerful machine to September, 1981, and replaced by a CRA Y I-S which SRS has access, it is more efficient to use since upgraded to a CRA Y X-MP. CLASSY on the CRA Y than at BBN. The block correlation program has been very efficiently implemented on the CRA Y since the correlation III. CURRENT OPERA nONS ON THE CRA Y process lends itself well to vectorization. A. OVERVIEW B. MAXIMUM LIKELIHOOD CLASSIFICATION The CRA Y is a pipelined machine built by CRAY Research, Inc (CRAY-l S, 1981). It has t~e The maximum likelihood classify program is a usual 64-bit word but does not support a 32-blt nearly ideal program for any sort of vector machine. word mode.