Doctoral Thesis Research Proposal Igor Drecki

GEOGRAPHICAL INFORMATION UNCERTAINTY: THE CONCEPT AND REPRESENTATIONAL CHALLENGES

Igor Drecki

School of Geography, Geology and Environmental Science

The University of Auckland

Auckland, New Zealand

Abstract

Understanding and representing geographical information uncertainty poses a significant challenge in geographical information science (GIScience) research. The concept of information uncertainty is not well defined and has different interpretations across many disciplines of knowledge. Ambiguous terminology used in uncertainty characterisation is one of the impediments to effective representation of information uncertainty. The development of representation tools to assist researchers in understanding and dealing with geographical information uncertainty has been underway for over a decade. However, these efforts lack comprehensiveness in their approach to representing information uncertainty by not considering all known or desirable factors that influence visualisation of information uncertainty for a particular purpose.

This paper examines the nature of geographical information uncertainty by briefly discussing the concept of uncertainty and its relevance in GIScience. The challenge of representing geographical information uncertainty in a comprehensive way is identified and a strategy that involves considering all known or desirable parameters that influence representation of uncertainty for a particular purpose is proposed.

The Concept of Geographical Information Uncertainty

Conceptualising and representing geographical information uncertainty presents a considerable challenge in GIScience research. One of the fundamental shortcomings is the lack of a commonly accepted meaning of uncertainty. The term ‘uncertainty’ is used in a number of fields, which adds further confusion to its definition. Uncertainty in mathematics, for example, is closely associated with probability theory, which describes the occurrence (or non-occurrence) of certain ‘events’ as random. Statistics often refers to the standard deviation of a single value within a given range of values as a measure of uncertainty. In cognitive psychology, uncertainty could be a matter of (human) condition, such as insecurity or expectation.

When it comes to uncertainty associated with geographical information, the situation is equally complex. Buttenfield (1993) identifies ambiguous terminology used in uncertainty characterisation as one of the impediments to effective representation of information uncertainty. Terms such as accuracy, error, data quality or reliability are often used interchangeably, each containing a degree of ambiguity when applied to describing uncertainty. For a comprehensive discussion of the definition of uncertainty, see e.g. MacEachren (1992), Buttenfield (1993), Pang (2001), Leyk et al. (2005) or Griethe and Schumann (2005, 2006).

In this research, uncertainty is understood as the lack of objective knowledge about accuracy in geographical information (Hunter and Goodchild, 1993). Although not specifically defined, uncertainty covers a broad concept of vagueness or doubt, and includes accuracy as a key component. It forms part of the decision-making process with geographical data and influences the analyst’s trust in any dataset, thus making uncertainty a critical consideration in GIScience.

Representing Information Uncertainty

Visualisation provides a valuable approach for representing and interacting with geographical information in general. It is seen as a recognised method in GIScience research, where maps and visual displays are part of a research sequence (DiBiase et al., 1992). Drecki (2002) argues that the use of visual means to represent spatial relationships that cannot be easily represented by other forms of expression (e.g. textual, verbal or sonic) is particularly advantageous. Visualisation engages the principal human information-processing ability, that associated with vision (Buttenfield and Mackaness, 1991; MacEachren et al., 1992), and could provide immediate, compact and well-structured information to a reader. It is particularly useful when geographical information is used in decision making and information analysis, where the implications of using uncertain information may be significant (Drecki and Maciejewska, 2005). It is also an important consideration when assessing the suitability of data for geographic information systems (GIS) operations or for a particular application.

Visualisation of information uncertainty associated with geographical information has been a subject of increasing attention from researchers since the beginning of the 1990s. An early important influence was the National Center for Geographic Information and Analysis (NCGIA) Research Initiative on Visualisation of Spatial Data Quality (Beard et al., 1991). The Initiative played a significant role in addressing the need for exploration, development and evaluation of visual techniques to communicate uncertainty.

Following that, MacEachren (1992), McGranaghan (1993) and van der Wel et al. (1994) compiled a list of potential visualisation tools that are available for representation of information uncertainty. They evaluated the extended set of visual variables pointing out the suitability of some variables, for example colour saturation or focus, over others. Krygier (1994) and Fisher (1994) suggested sonification as a valid complementary method for representing and exploring uncertainty.

Other authors developed specific visual techniques and applications to represent information uncertainty associated with a particular scenario. For example, a colour transformation technique was utilised by Jiang et al. (1995) in the application of a modified HSI (hue, saturation, intensity) colour system for visualising uncertainty in fuzzy overlay operations, while Hengl (2003) used a similar principle to represent uncertainty associated with spatial prediction of continuous and discrete variables in soil and landform mapping. Drecki (1997, 2002) developed opacity and ‘squares’ techniques to visualise attribute uncertainty derived from the classification of remotely sensed imagery.

Among the interactive tools and environments employed in visualising information uncertainty, Fisher (1993, 1994) used animation to create a blinking effect where the displayed soil grid cells change their colour according to the proportion of being assigned to one of the existing soil classes. MacEachren et al. (1993) developed an interactive application based on the principle of a ‘quality slider’ to visualise the uncertainty in Chesapeake Bay monitoring data. In complementary work, Lucieer and Kraak (2004) employed interactivity to create an exploratory visualisation environment to assist analysts in the classification of remotely sensed imagery and related uncertainty. Work by Clarke et al. (1999) focused on the application of visual depth in a virtual environment to represent positional uncertainty associated with a road network.

The majority of the above approaches to visualising information uncertainty are centred on the use of visual variables (Bertin, 1983) and their extensions, such as opacity, colour saturation or clarity (Drecki, 2002; MacEachren, 1994). Others utilise virtual environments (e.g. Clarke et al., 1999) or sophisticated computer-based applications and geovisualisations (e.g. Lucieer and Kraak, 2004; Slocum et al., 2003).

A Comprehensive Approach

For over a decade, the development of representation tools to assist researchers in understanding and dealing with geographical information uncertainty has been underway. Many visualisations developed to date focus on representing a single parameter of uncertainty (such as positional accuracy or lineage), or just a narrow selection of parameters. Consequently, they lack comprehensiveness in their approach to depicting information uncertainty by not considering all known or desirable factors that influence representation of information uncertainty for a particular purpose. Furthermore, as MacEachren et al. (2005) point out, little is known about parameters that help to create successful uncertainty representations. As a result, geographical information analysis is not well supported by appropriate and comprehensive statements on information uncertainty, often leading to questionable results and uninformed decision making.

In this research, the comprehensive approach relates to considering all known or desirable parameters that influence visualisation of information uncertainty for a particular purpose. For example, Thomson et al. (2005) argue that considering nine categories of uncertainty, i.e. accuracy, precision, completeness, consistency, lineage, currency, credibility, subjectivity and interrelatedness, is necessary to support geospatial intelligence analysis in a comprehensive way. However, in the context of national topographical information, interrelatedness (source independence from other information (Thomson et al., 2005)), although it is a known parameter of uncertainty, might have no relevance. As illustrated above, the approach to visualising information uncertainty needs to be adaptable to both a particular dataset and to its purpose.

Frameworks for Representing Uncertainty

There is a clear niche in the current research with regard to the comprehensive approach to visualising geographical information uncertainty. The initial efforts on formalisation of uncertainty visualisations resulted in the development of the Spatial Data Transfer Standard (SDTS) (NIST, 1992). The categories of data quality in the SDTS include lineage, positional accuracy, attribute accuracy, logical consistency and completeness. Currency, although not regarded as a data quality category within the SDTS, has its place in the uncertainty visualisation framework (Beard et al., 1991; Beard and Mackaness, 1993). This development provided an initial attempt at dealing with uncertainty comprehensively.

Recent research on a typology for visualising uncertainty suggested inclusion of further categories, namely precision, credibility, subjectivity and interrelatedness, as well as combining positional accuracy and attribute accuracy into a single accuracy category (MacEachren et al., 2005; Thomson et al., 2005). These categories can be put into a matrix, whereby each category of uncertainty is matched with a distinction among location (position), attribute and time components of data (MacEachren, 1994; Thomson et al., 2005). The following list gives brief definitions for each uncertainty category (all quotes from Thomson et al., 2005):

• completeness: extent to which information is comprehensive

• consistency: extent to which information components agree

• lineage: conduit through which information passed

• currency: temporal gaps between occurrence, information collection and use

• credibility: reliability of information source

• subjectivity: amount of interpretation or judgement included

• interrelatedness: source independence from other information

• accuracy: difference between observation and reality

• precision: exactness of measurement

Although the above research does not specifically deal with developing visual tools for representing information uncertainty, it provides a sound base for extending research in this direction. A useful attempt has been proposed by Drecki and Maciejewska (2005) in their approach to visualising uncertainty associated with a large-scale contemporary topographic database. They introduced the concept of a global visual uncertainty indicator that provides an aggregate of scores, based on a certain set of rules, for each of the above categories of uncertainty. This indicator provides an immediate ‘overview’ of the overall information uncertainty (Fig 1), alerting users to the local variations and so-called ‘hot spots’ in the data (Drecki and Maciejewska, 2005).
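The idea of a global indicator that aggregates category scores can be illustrated with a short sketch. It is not the method of Drecki and Maciejewska (2005), whose aggregation rules are not given here. The unweighted mean, the 1 to 5 score range, the hot-spot threshold and the function names are all assumptions made for illustration only.

```python
import math
from statistics import mean

# The nine uncertainty categories of the Thomson et al. (2005) typology.
CATEGORIES = (
    "accuracy", "precision", "completeness", "consistency", "lineage",
    "currency", "credibility", "subjectivity", "interrelatedness",
)


def global_indicator(scores):
    """Aggregate per-category scores into one global score.

    ASSUMPTION: scores run from 1 (low uncertainty) to 5 (high uncertainty)
    and are combined by an unweighted mean rounded half up. The source only
    states that the aggregate is 'based on a certain set of rules'.
    """
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    return math.floor(mean(scores[c] for c in CATEGORIES) + 0.5)


def hot_spots(quadrangles, threshold=4):
    """Return ids of quadrangles whose global score reaches the threshold.

    ASSUMPTION: a 'hot spot' is any quadrangle scoring at or above
    `threshold`; the threshold value is hypothetical.
    """
    return sorted(
        qid for qid, scores in quadrangles.items()
        if global_indicator(scores) >= threshold
    )
```

With one dictionary of category scores per quadrangle, `hot_spots` returns the map sheets that would be flagged for closer inspection. A weighted scheme or a worst-case rule could replace the mean without changing the structure.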

As an extension of the research by Drecki and Maciejewska (2005), the introduction of an additional indicator, which synthesises and visualises the components of data, i.e. location (position), attribute and time, for a particular category of uncertainty, could be pursued. The development of a link between this indicator and the global visual uncertainty indicator seems to be a logical expansion of this research.
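A minimal sketch of how such a per-category indicator might be organised is given below. It stores the typology as a matrix of uncertainty categories against the location, attribute and time components, and synthesises the components of one category into a single score. The worst-case (maximum) rule, the 1 to 5 score range and all function names are assumptions for illustration, not part of the cited work.

```python
CATEGORIES = (
    "accuracy", "precision", "completeness", "consistency", "lineage",
    "currency", "credibility", "subjectivity", "interrelatedness",
)
COMPONENTS = ("location", "attribute", "time")


def empty_matrix():
    """Build a category-by-component matrix; None marks 'not assessed'."""
    return {cat: {comp: None for comp in COMPONENTS} for cat in CATEGORIES}


def category_indicator(matrix, category):
    """Synthesise the component scores of one uncertainty category.

    ASSUMPTION: scores run from 1 (low) to 5 (high uncertainty) and the
    worst assessed component determines the category score. Returns None
    when no component of the category has been assessed.
    """
    scored = [s for s in matrix[category].values() if s is not None]
    return max(scored) if scored else None


# Hypothetical scores for a single map sheet.
matrix = empty_matrix()
matrix["accuracy"]["location"] = 2
matrix["accuracy"]["attribute"] = 4
print(category_indicator(matrix, "accuracy"))  # worst assessed component
print(category_indicator(matrix, "lineage"))   # nothing assessed yet
```

Keeping unassessed cells as `None` rather than zero avoids presenting missing meta-information as low uncertainty. The per-category scores produced this way could then feed a global indicator.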

Another approach in dealing with information uncertainty comprehensively has been proposed by Leyk et al. (2005). Their research focused on presenting a conceptual framework for systematic investigation of uncertainty associated with land cover change modelling from historical map data. Their model consists of three ‘domains’ of uncertainty, i.e. production-, transformation- and application-oriented in which sources of uncertainty are systematically exposed (Leyk et al., 2005). The following list gives brief definitions for each domain (all quotes from Leyk et al., 2005):

• production-oriented uncertainty: the amount of uncertainty inherent in the source data

• transformation-oriented uncertainty: the amount of uncertainty caused by data processing and editing

• application-oriented uncertainty: the amount of uncertainty dependent on the intended application

These descriptions are linked with earlier work on the recognition and assessment of error in spatial data. The first two domains are related to the source error and processing error respectively (Walsh et al., 1987), while the third one is related to the use error (Beard, 1989). This framework provides a suitable basis for comprehensive investigation of geographical information uncertainty (Leyk et al., 2005). Furthermore, it presents a potential for developing a comprehensive approach to visualising uncertainty, although no research to this effect has been reported.

Figure 1. Representation of global visual uncertainty indicator.

Dealing with Geographical Information Uncertainty Comprehensively

The issue of a comprehensive approach to visualising geographical information uncertainty has not been addressed in the relevant literature to date. As a consequence, there is no clear methodology for researching this topic. However, a preliminary attempt has been documented by Drecki and Maciejewska (2005). They adopted a typology proposed by Thomson et al. (2005) to visualise uncertainty associated with a large-scale topographical database. They developed a set of specific and explanatory visualisations to support each of the uncertainty categories within the above typology. The resultant representations offered a direct view of uncertainty and provided a means to deal systematically and comprehensively with uncertainty either inherent in the database or caused by information processing. The methodology employed by Drecki and Maciejewska (2005) involved:

• selecting a geographical dataset

• gathering meta-information

• matching meta-information with categories of uncertainty

• adopting a strategy for uncertainty visualisation, including standardisation

• evaluating uncertainty visualisations

A large-scale topographical database for the Province of Burgos in central Spain provided an ideal dataset for investigation of uncertainty patterns. The dataset, although designed to be seamless, consisted of 550 quadrangles, each spanning 5’ of longitude by 2.5’ of latitude, of which 60% have been covered. Access to meta-information detailing aspects of topographical data handling relevant to the adopted typology offered a suitable base. Information on data collection, processing and analysis for each of the mapped quadrangles was well documented.

The meta-information was then matched with the uncertainty categories described by Thomson et al. (2005). For example, information related to the experience of an analyst performing spatial operations was matched with the credibility category, while details on the age of source materials and the time of their processing were matched with the currency category.
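The matching step can be pictured as a simple lookup from meta-information fields to uncertainty categories. The sketch below is illustrative only: the field names are hypothetical and do not come from the Burgos database, and only the two pairings mentioned above (analyst experience with credibility, source age and processing time with currency) are taken from the text.

```python
# ASSUMPTION: field names are invented for illustration; only the
# field-to-category pairings reflect the examples given in the text.
FIELD_TO_CATEGORY = {
    "analyst_experience": "credibility",
    "source_material_age": "currency",
    "processing_date": "currency",
}


def group_by_category(metadata):
    """Group one quadrangle's meta-information by uncertainty category.

    Returns a (grouped, unmatched) pair: `grouped` maps each category to
    the fields assigned to it, and `unmatched` lists the fields that have
    no category yet and need a manual decision.
    """
    grouped = {}
    unmatched = []
    for field, value in metadata.items():
        category = FIELD_TO_CATEGORY.get(field)
        if category is None:
            unmatched.append(field)
        else:
            grouped.setdefault(category, {})[field] = value
    return grouped, unmatched
```

Reporting unmatched fields explicitly makes gaps in the matching visible, instead of silently dropping meta-information that fits no category.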

Geographical information and related uncertainty can either be visualised separately or in combination (MacEachren, 1992). The former refers to a situation where data is visualised independently from the uncertainty information. The latter is applicable when both data and uncertainty are incorporated in the same view, which is often called a bivariate representation. In the case of the research by Drecki and Maciejewska (2005), separate representation was chosen. Since different categories of uncertainty were reported using a wide array of qualitative and quantitative measures, a strategy for their consistent representation was critical. This was achieved by adopting a 5-step ordinal scale where a diverging colour scheme (Brewer, 1994) signified the uncertainty distinctions (see Fig 1). Drecki and Maciejewska (2005) suggested proper empirical testing and evaluation of the uncertainty visualisations developed in their case study, although no methodology for conducting such an evaluation was included. They concluded that the presented visualisations are valid from a methodological standpoint, offering a systematic view of uncertainty across all data quality categories and providing a platform for more comprehensive studies of uncertainty. However, their research did not take into account uncertainty arising from the inappropriate application of information (Aronoff, 1989; Beard, 1989).
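The standardisation step can be sketched as follows: heterogeneous measures are first reduced to one of five ordinal steps, and each step is then given a colour from a diverging scheme. The break points, the qualitative labels, the direction of the scale and the specific palette (the five-class ColorBrewer RdYlBu scheme) are assumptions for illustration; the colours and class limits actually used in the case study are not stated here.

```python
from bisect import bisect_right

# ASSUMPTION: five-class ColorBrewer RdYlBu, ordered so that step 1
# (low uncertainty) is blue and step 5 (high uncertainty) is red.
PALETTE = ("#2c7bb6", "#abd9e9", "#ffffbf", "#fdae61", "#d7191c")

# ASSUMPTION: hypothetical labels for qualitatively reported measures.
QUALITATIVE_STEPS = {
    "very low": 1, "low": 2, "moderate": 3, "high": 4, "very high": 5,
}


def ordinal_step(value, breaks):
    """Map a quantitative measure onto steps 1 to 5.

    `breaks` holds four ascending class limits; a value equal to a limit
    falls into the higher step.
    """
    if len(breaks) != 4 or list(breaks) != sorted(breaks):
        raise ValueError("breaks must be four ascending class limits")
    return bisect_right(breaks, value) + 1


def colour_for(step):
    """Return the hex colour of an ordinal step (1 to 5)."""
    if step not in range(1, 6):
        raise ValueError("step must be between 1 and 5")
    return PALETTE[step - 1]


# Hypothetical positional error in metres with hypothetical class limits.
step = ordinal_step(7.5, breaks=[1, 2, 5, 10])
print(step, colour_for(step))
print(colour_for(QUALITATIVE_STEPS["moderate"]))
```

Because both quantitative and qualitative measures end up on the same five steps, every uncertainty category can share one legend, which is what makes the separate uncertainty maps comparable.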

Conclusion

Visualisation is a powerful tool in providing immediate, compact and well-structured information about uncertainty (Drecki, 1997; van der Wel et al., 1994). As noted, the need for a systematic and comprehensive approach to dealing with geographical information uncertainty is becoming increasingly pressing (Drecki and Maciejewska, 2005). Geographical information is used in spatial analysis and decision making, where using uncertain information may have fundamental implications for the reliability of analytical results. Buttenfield (1993) identifies the lack of methods for depicting uncertainty as one of the impediments to effective visualisation of information uncertainty. MacEachren et al. (2005) note that still little is known about parameters that help to create successful uncertainty representations.

The above frameworks for visualising uncertainty proposed by Drecki and Maciejewska (2005) and Leyk et al. (2005) constitute a sound foundation for the development of comprehensive visualisations of geographical information uncertainty. As a consequence there is a clear need for establishing a link between current frameworks for dealing with geographical information uncertainty and appropriate visualisations. These visualisations need to be based on sound principles of cartographic theory. Finding the parameters that support successful uncertainty representations constitutes a key challenge.

References

Aronoff S, 1989, Geographic Information Systems: A Management Perspective, WDL Publications, Ottawa, Ontario.

Beard M, 1989, Use error: The neglected error component, Proceedings of the 9th International Symposium on Computer-assisted Cartography Auto-Carto 9, Baltimore, Maryland, 808-817.

Beard M, B Buttenfield and S Clapham, 1991, Visualization of Spatial Data Quality, Scientific Report for the Specialists Meeting, NCGIA Technical Paper 91-26, Santa Barbara, California.

Beard M and W Mackaness, 1993, Visual access to data quality in geographic information systems, Cartographica, 30 (2&3), 37-45.

Bertin J, 1983, Semiology of Graphics, University of Wisconsin Press, Madison, Wisconsin.

Brewer C, 1994, Color use guidelines for mapping and visualisation, in: A MacEachren and D Taylor (eds), Visualisation in Modern Cartography, Elsevier, Oxford, 123-147.

Buttenfield B, 1993, Representing data quality, Cartographica, 30 (2&3), 1-7.

Buttenfield B and W Mackaness, 1991, Visualisation, in: D Maguire, M Goodchild and D Rhind (eds), GIS: Principles and Applications, Longman, London, Vol 1, 427-443.

Clarke K, P Teague and HG Smith, 1999, Virtual depth-based representation of cartographic uncertainty, in: Shi W, M Goodchild and P Fisher (eds), Proceedings of the International Symposium on Spatial Data Quality’99 (ISSDQ’99), Hong Kong Polytechnic University, Hong Kong, 253-259.

DiBiase D, A MacEachren, J Krygier and C Reeves, 1992, Animation and the role of map design in scientific visualisation, Cartography and Geographic Information Systems, 19 (4), 201-214, 265-266.

Drecki I, 1997, Visualisation of Uncertainty in Spatial Data, unpublished MSc thesis, University of Auckland, Auckland.

Drecki I, 2002, Visualisation of uncertainty in geographical data, in: W Shi, P Fisher and M Goodchild (eds), Spatial Data Quality, Taylor & Francis, London, 140-159 (colour plates 1-3).

Drecki I and I Maciejewska, 2005, Dealing with uncertainty in large-scale spatial databases, Proceedings of the 22nd International Cartographic Conference (CD-Rom), ICA, A Coruna, Spain, 9-16 July.

Fisher P, 1993, Visualising uncertainty in soil maps by animation, Cartographica, 30 (2&3), 20-27.

Fisher P, 1994, Animation and sound for the visualisation of uncertain spatial information, in: H Hearnshaw and D Unwin (eds), Visualisation in Geographical Information Systems, John Wiley & Sons, Chichester, 181-185.

Griethe H and H Schumann, 2005, Visualizing uncertainty for improved decision making, Proceedings of the 4th International Conference on Business Informatics Research BIR 2005, University of Skovde, Skovde, Sweden, 3-4 October.

Griethe H and H Schumann, 2006, The visualisation of uncertain data: Methods and problems, Proceedings of SimVis’06, Magdeburg, Germany, 2-3 March.

Hengl T, 2003, Visualisation of uncertainty using the HSI colour model: Computations with colours, Proceedings of the 7th International Conference on GeoComputation (CD-Rom), Southampton, 8-10 September, 8-17.

Hunter G and M Goodchild, 1993, Managing uncertainty in spatial databases: Putting theory into practise, Proceedings of Urban and Regional Information Systems Association, Atlanta, Georgia, 25-29 July, 1-14.

Jiang B, W Kainz and F Ormeling, 1995, A modified HLS color system used in visualization of uncertainty, Proceedings of GeoInformatics ’95 Hong Kong, International Symposium on Remote Sensing, Geographic Information Systems & Global Positioning Systems in Sustainable Development and Environmental Monitoring, The Chinese University of Hong Kong, Hong Kong, Vol 2, 702-712.

Krygier J, 1994, Sound and geographic visualization, in: A MacEachren and D Taylor (eds), Visualization in Modern Cartography, Elsevier, Oxford, 149-166.

Leyk S, R Boesch and R Weibel, 2005, A conceptual framework for uncertainty investigation in map-based land cover change modelling, Transactions in GIS, 9 (3), 291-322.

Lucieer A and M-J Kraak, 2004, Interactive and visual fuzzy classification of remotely sensed imagery for exploration of uncertainty, International Journal of Geographical Information Science, 18 (5), 491-512.

MacEachren A, 1992, Visualising uncertain information, Cartographic Perspectives, 13, 10-19.

MacEachren A, 1994, Some Truth with Maps: A Primer on Symbolisation and Design, Association of American Geographers, Washington, District of Columbia.

MacEachren A, in collaboration with B Buttenfield, J Campbell, D DiBiase and M Monmonier, 1992, Visualisation, in: R Abler, M Marcus and J Olson (eds), Geography’s Inner Worlds, Rutgers University Press, New Brunswick, New Jersey, 99-137.

MacEachren A, D Howard, T Taormino, D Askov and M von Wyss, 1993, Visualising the health of the Chesapeake Bay: An uncertainty endeavor, Proceedings of GIS/LIS ’93, Minneapolis, Minnesota, 449-458.

MacEachren A, A Robinson, S Hopper, S Gardner, R Murray, M Gahegan and E Hetzler, 2005, Visualising geospatial information uncertainty: What we know and what we need to know, Cartography and Geographic Information Science, 32 (3), 139-160.

McGranaghan M, 1993, A cartographic view of spatial data quality, Cartographica, 30 (2&3), 8-19.

NIST (National Institute of Standards and Technology), 1992, Federal Information Processing Standard, Publication 173 (Spatial Data Transfer Standard), US Department of Commerce, Washington, District of Columbia.

Pang A, 2001, Visualizing uncertainty in geo-spatial data, Proceedings of the Workshop on the Intersections between Geospatial Information and Information Technology, National Academies Committee of the Computer Science and Telecommunications Board, Arlington, Virginia.

Slocum T, D Cliburn, J Feddema and J Miller, 2003, Evaluating the usability of a tool for visualizing the uncertainty of the future global water balance, Cartography and Geographic Information Science, 30 (4), 299-317.

Thomson J, B Hetzler, A MacEachren, M Gahegan and M Pavel, 2005, A typology for visualising uncertainty, Proceedings of Visualisation and Data Analysis 2005, IS&T/SPIE Symposium on Electronic Imaging 2005, San Jose, California.

van der Wel F, R Hootsmans and F Ormeling, 1994, Visualisation of data quality, in: A MacEachren and D Taylor (eds), Visualisation in Modern Cartography, Elsevier, Oxford, 313-331.

Walsh S, D Lightfoot and D Butler, 1987, Recognition and assessment of error in geographic information systems, Photogrammetric Engineering & Remote Sensing, 53 (10), 1423-1430.