Assessing Raster Representation Accuracy Using a Scale Factor Model
Jeong Chang Seong and E. Lynn Usery

J.C. Seong is with the Department of Geography, Northern Michigan University, 1401 Presque Isle Ave., Marquette, MI 49855 ([email protected]).
E.L. Usery is with the Department of Geography, University of Georgia, 204 GGS Building, Athens, GA 30602 (usery@arches.uga.edu; [email protected]).

Photogrammetric Engineering & Remote Sensing, Vol. 67, No. 10, October 2001, pp. 1185-1191.
0099-1112/01/6710-1185$3.00/0
© 2001 American Society for Photogrammetry and Remote Sensing

Abstract

Raster datasets of global and continental extent are subject to error resulting from projection transformation. This paper examines the error problem from a theoretical perspective and develops a model to calculate the extent of the errors. The theoretical examination indicates that error results in two forms, areal size change of pixels and categorical error resulting from loss or duplication of pixels. A scale factor model, based on the horizontal and vertical scale factors of the projection, is developed to provide a computation of the resulting error from specific projections. The model is experimentally tested with the cylindrical equal area, sinusoidal, and Mollweide projections. Results indicate that the model predicts error within one percent of actual values and that the sinusoidal projection is subject to smaller errors in projecting raster data than the other projections tested.

Introduction

With society's increasingly global perspective and global modeling needs, raster databases of geographic phenomena and processes at continental and global scales have been constructed using a number of different map projections. These raster databases exist at a variety of resolutions and have been generated from many different data sources. For example, Steinwand (1994) coded algorithms of the Interrupted Goode Homolosine projection for coarse-resolution global data sets. Tobler et al. (1995) used unprojected spherical quadrilateral grids to map global population data. Also, Lowman et al. (1999) used the Robinson projection to map tectonic and volcanic activity around the world. The U.S. Geological Survey also distributes digital raster datasets to the public to assemble global databases of vegetation, climate, land cover, and elevation at resolutions of one degree, one-half degree, one kilometer, and 30 arc-seconds, respectively. Dobson et al. (2000) developed the LandScan population database for the world on a 30-arc-second grid. These publicly available data may be in geographic coordinates or a projected coordinate system. In either case, users who wish to combine datasets through geographic information system (GIS) software are faced with a problem of map projection transformation and little guidance concerning the accuracy of the projection transformation process (Cliffe, 2000).

Usery and Seong (2001) have demonstrated that continental areas are subject to significant distortion if raster data are projected. Steinwand et al. (1995) identified the effects of map projection properties on data quality in global change studies using six equal-area projections: Interrupted Goode Homolosine, Interrupted Mollweide, Wagner IV, Wagner VII, Lambert Azimuthal Equal Area, and Oblated Equal-Area projections. They quantified and graphically depicted shape and scale distortions caused by reprojection. However, because their research used sample grids that were drawn on already projected maps, the application is limited: no theoretical background or models were presented to estimate and simulate the pixel value changes in various projection change situations. It is the purpose of this research to investigate the effect of projection distortion on raster representation at a global scale and develop a scale factor model to simulate the effect. The next three sections of this paper provide a theoretical approach to the assessment of raster representation accuracy based on a scale factor model. The fifth section develops an experimental design and tests this approach using three specific projections. After the extension of the model to other types of projections is discussed, some conclusions based on this work are provided in the last section.

Errors in Raster Representation and Transformation

The transformation of features at regional and global extents using the raster data structure brings two distinct problems, which may be labeled as areal size and category. The areal-size problem results from a transformation which causes the area represented in the raster data format to be unequal to the size of the same area represented in the vector data format unless the pixel resolution is infinitely small. In practice, rather than at an infinitely small pixel size, vector and raster resolutions converge to the same representation at a pixel size equivalent to the smallest number which can be represented in the computer, which also corresponds to the smallest possible pixel size and the smallest vector distance which can be represented between two points.

The category problem is that some categories may be lost or duplicated when vector features are represented in raster format. There is no loss or gain of categories among equal-area projections in the vector data structure because original features will remain after projection. However, the situation is quite different in the raster data structure. Suppose there is a raster data layer with a 1-km pixel resolution in the Interrupted Goode Homolosine projection with 255 categories and the data are reprojected to the Albers conic equal-area projection with the same 1-km pixel resolution. Unlike the flexibility of vector representation, the rigidity of raster representation and the requirement of nearest-neighbor interpolation for categorical data cause the frequency of each category to be changed, sometimes significantly. As an example, a subset of the Eurasia dataset of the global land-cover characterization (GLCC) data (Eidenshink and Faundeen, 1994; Lauer and Eidenshink, 1998; Brown, 1999) was reprojected from the Interrupted Goode Homolosine projection to the Albers equal-area conic projection and the geographic latitude and longitude system. The subset covered the latitude range of 47.1° N through 53.7° N and the longitude range of 15.3° E through 32.0° E with 37 global ecosystem land-cover categories. The reprojection result showed 2.7 percent and 35.5 percent errors when the data were reprojected to the Albers equal-area conic projection and geographic latitude/longitude system, respectively.

The difference among categorical value frequencies occurring at the projection stage is not easily visible because the total area is retained in the projected image if an output projection is an equal-area projection. However, the error is obvious when the frequency of each category is compared between the two images (Usery and Seong, 2001). Categorical values may be gained or lost depending on the resulting shape of the reprojected features and the arrangement of raster pixels.

Characteristics of Categorical Errors in Raster Representation

In an equal-area projection, a feature on the globe projected to a flat surface should be represented with its original size. If the feature's size is not represented correctly, the difference between the original size and the projected size is the extent of the error.

In Figure 1, assume the symbol 'x' is the center of each cell. If a rasterization algorithm uses the feature in the vector representation occurring at the center of a pixel for representing the pixel in the raster representation, features may be ignored. In the case of projection A, the x1, x2, x4, and x5 features are lost. Projection B shows a feature loss of only x5. However, features also may be over-represented as in projection B, where x2 is duplicated. Therefore, the total error extent is composed of the pixels that are ignored and those that are over-represented. In other words, the accuracy of raster representation is the size of correct categorical representation. [...] representing the original feature's size, which yields an 80 percent representation accuracy.

The error occurs when the original shape is distorted. If a projection changes an original shape severely, we may expect an increased error. The number of categories and spatial autocorrelation also affect the error. If there is only one category, which is the case of highest possible spatial autocorrelation, its frequency will be the same as the total number of pixels. Conversely, if there are many categories with low spatial autocorrelation, projection may bring significant frequency changes. Accuracy varies directly with the spatial autocorrelation of the image; lower spatial autocorrelation yields lower accuracy and higher spatial autocorrelation yields higher accuracy. At small spatial extents, as with local raster databases, the error will not be as significant as with global databases because of relatively small shape distortions. In global-scale databases, significant error is expected due to the severe shape distortion required to maintain areal equivalency.

Modeling the Accuracy of Raster Representation of Equal-Area Projections

Equivalency can be maintained by distorting shape. Specifically, when the product of the vertical and horizontal scale factors is 1.0, equivalency is maintained (Bugayevskiy et al., 1995; Yang et al., 2000). In most map projections, the horizontal and vertical scale factors are represented as mathematical equations. The change of horizontal and vertical scales results in shape change of features. If a vector data structure is used, the change of scales does not affect the representation of categories because all features will be represented in the projected vector database. However, if a raster database is used, severely reduced features may be ignored because they are not large enough to be represented by a pixel that is much larger than the reduced feature size. Figure 2 shows the relationship between scale change and raster representation. Original features are shown in distorted shape in thinner lines. New raster-