Uncertainty-Aware Multidimensional Ensemble Data Visualization and Exploration
Total Page:16
File Type:pdf, Size:1020Kb
Uncertainty-aware Multidimensional Ensemble Data Visualization and Exploration Haidong Chen, Song Zhang, Senior Member, IEEE, Wei Chen, Member, IEEE, Honghui Mei, Jiawei Zhang, Andrew Mercer, Ronghua Liang, Member, IEEE, and Huamin Qu, Member, IEEE Abstract—This paper presents an efficient visualization and exploration approach for modeling and characterizing the relationships and uncertainties in the context of a multidimensional ensemble dataset. Its core is a novel dissimilarity-preserving projection technique that characterizes not only the relationships among the mean values of the ensemble data objects but also the relationships among the distributions of ensemble members. This uncertainty-aware projection scheme leads to an improved understanding of the intrinsic structure in an ensemble dataset. The analysis of the ensemble dataset is further augmented by a suite of visual encoding and exploration tools. Experimental results on both artificial and real-world datasets demonstrate the effectiveness of our approach. Index Terms—Ensemble visualization, uncertainty quantification, uncertainty visualization, multidimensional data visualization ! 1INTRODUCTION ture, pressure, and wind direction. Therefore, challenges for ECENT developments in scientific simulation research analyzing an ensemble dataset include both the uncertainty R have resulted in an increased role for scientific sim- with respect to each output variable and the high dimension- ulations and analysis. One common way to study uncer- ality of the dataset. tainty is the ensemble simulation, which employs stochastic There have been a large number of uncertainty visu- initial conditions or multiple parameterizations to produce alization and analysis methods [2] that go beyond tradi- an ensemble of simulation outcome. For example, in the tional summary statistics [3] and depict ensemble data in numerical weather simulation, both initial conditions and more detail. Conventional visualization solutions exploit simulation parameters (e.g., cumulus schemes and micro- glyphs [4], [5], [6] and visual variables [7], [8] to encode physics schemes) can be perturbed in an ensemble forecast uncertainties. However, most of them are only designed for simulation. The number of ensemble runs ranges from 1D or 2D datasets and have limited capabilities to reveal the dozens to thousands. In this paper, all ensemble members intrinsic structures in the ensemble dataset. Recent methods of a single data entry are called an ensemble data object, can effectively characterize and analyze the uncertainty e.g., the ensemble numerical weather forecast at a geospatial structures [9], [10] and forms [11], [12] but are designed for location. The ensemble members of an ensemble data object data objects with one variable. may be averaged to obtain the ensemble mean, which is To address the second challenge, high dimensionality, regarded as a representative of the ensemble members. Un- multidimensional projection techniques [13], [14] are widely fortunately, important information is lost in the averaging used to build a low-dimensional layout that respects the process. It is highly beneficial to characterize the uncertainty distances among data objects in the high-dimensional space. of an ensemble dataset during the entire analysis process. In this low-dimensional layout, closely positioned points In most scientific applications, the resulting simulation indicate similar data objects in the high-dimensional space. output is a multivariate field. For instance, the output of However, most conventional projection methods are de- a typical Weather Research and Forecasting (WRF) simula- veloped for datasets in which a data object has only one tion [1] includes more than 100 variables such as tempera- instance. Simply using the ensemble mean as an instance to represent a data object for projection suffers from heavy • Haidong Chen, Wei Chen, Honghui Mei, and Jiawei Zhang information loss, because the shape of distribution for each are with the State Key Lab of CAD & CG, Zhejiang Uni- ensemble data object is lost during the averaging process. versity, CHINA, 310058. E-mail: [email protected], chen- A conceptual example is shown in Fig. 1 (a,b), where four [email protected], [email protected], and [email protected]. • Wei Chen is the corresponding author. He is also with the Cyber Innova- ensemble data objects with similar ensemble means but tion Joint Research Center, Zhejiang University. different ensemble distributions are lumped together by • Song Zhang is with the Department of Computer Science and a conventional multidimensional projection method to the Engineering, Mississippi State University, U.S., 39762. E-mail: [email protected]. ensemble means. This can cause misleading perceptions • Andrew Mercer is with the Geosciences Department, Mississippi State from users. For example, users might advocate that these University, U.S., 39762. E-mail: [email protected]. four data objects are almost the same as they are located • Ronghua Liang is with the College of Information Engineering, Zhejiang close to each other. University of Technology, CHINA, 310014. E-mail: [email protected]. • Huamin Qu is with the Department of Computer Science and En- It remains a challenging task to visually explore the gineering, Hong Kong University of Science and Technology. E-mail: intrinsic structures of a high dimensional ensemble dataset [email protected]. as well as the uncertainty of each individual ensemble Digital Object Identifier no. 10.1109/TVCG.2015.2410278 data object. The key contribution of this paper is a novel 1077-2626 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. 2 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS uncertainty-aware multidimensional projection approach • A novel uncertainty-aware multidimensional projec- that generalizes the conventional projection methods for tion technique for ensemble datasets. ensemble datasets. Its core is a new dissimilarity measure of • A multi-view interactive visualization system for the ensemble data objects and an enhanced Laplacian-based effective exploration of intrinsic structures and un- projection scheme. Compared with the conventional projec- certainties in ensemble datasets. tion methods that can solely respect the distances among ensemble means, our dissimilarity measure preserves the relationships among the ensemble distributions as well as 2RELATED WORK the ensemble means. By setting the distributional difference Our work relates to multiple topics of visualization includ- weight to zero (see Equation (7)), our approach regresses to ing uncertainty visualization, ensemble data visualization, a conventional projection scheme. The enhanced Laplacian- and multidimensional/multivariate data visualization. based projection scheme achieves a balance between the Uncertainty spreads throughout the entire data analysis local and global point layout by imposing global constraints pipeline including acquisition, transformation, and visual- in constructing the Laplacian system. ization [15], [16]. Depicting uncertainty can significantly Fig. 1 (c) illustrates the effectiveness of our uncertainty- help analysts make better decisions [17]. Thomson et al. [18] aware projection approach, with which four data objects proposed a typology for uncertainty visualization in intel- are positioned in the 2D visual plane. In this projection, U2 ligence analysis. Potter et al. [19] presented a comprehen- and U3 are located close to each other due to their similar sive survey for uncertainty quantification and visualization ensemble means and distributions. As U1 and U4 follow of scientific data. In the past decade, a large number of different distributions from U2 and U3, they are positioned uncertainty visualization techniques have been developed. far away from U2 and U3 even though all of them have They can be roughly classified into four categories: glyph similar ensemble means. based, visual variable based, geometry based, and anima- tion based. Glyph based methods encode uncertainties into well-designed glyphs (e.g., the flow radar glyph [4], the U U U U 3 1 circular glyph [20], the summary plot [5], and the uncertain 1 2 U4 ODF glyph [21]) and place them into the original data field. U (b) 2 Similarly, visual variables such as color [10], [22], bright- U U U3 2 4 ness [8], [23], blurriness [24], and texture [7], [25] can also U 3 U be employed to encode uncertainty. The geometry based U 1 4 approaches are a family of techniques that adapt the basic (a) (c) geometry to represent uncertainty, including point [22], line [26], cube [27], and surrounding volume [10], [28]. Fig. 1. Projecting an example 1D ensemble dataset (a) which consists of four ensemble data objects with similar mean values but different Coninx et al. [29] and Lundstrom et al. [30] demonstrate distributions. (b) The result of a conventional multidimensional projection that animation can be used to express uncertainty as well. algorithm to the ensemble means. (c) The result of our method. In general, most of these techniques focus on 1D or 2D datasets and cannot be readily extended to multidimen- We note that as with any projection method, our method sional datasets. Wu et al. [31] introduced the standard error will not preserve the high-dimensional structures with 100% ellipsoid to characterize uncertainty arising in any stage of accuracy.