Colormap : a Data-Driven Approach and Tool for Mapping Multivariate
Total Page:16
File Type:pdf, Size:1020Kb
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 24, NO. X, XXXXX 2018 1 ND 1 ColorMap : A Data-Driven Approach and Tool 2 for Mapping Multivariate Data to Color 3 Shenghui Cheng, Wei Xu, Member, IEEE, and Klaus Mueller, Senior Member, IEEE 4 Abstract—A wide variety of color schemes have been devised for mapping scalar data to color. We address the challenge of 5 color-mapping multivariate data. While a number of methods can map low-dimensional data to color, for example, using bilinear 6 or barycentric interpolation for two or three variables, these methods do not scale to higher data dimensions. Likewise, schemes that 7 take a more artistic approach through color mixing and the like also face limits when it comes to the number of variables they can 8 encode. Our approach does not have these limitations. It is data driven in that it determines a proper and consistent color map from first 9 embedding the data samples into a circular interactive multivariate color mapping display (ICD) and then fusing this display with a 10 convex (CIE HCL) color space. The variables (data attributes) are arranged in terms of their similarity and mapped to the ICD’s 11 boundary to control the embedding. Using this layout, the color of a multivariate data sample is then obtained via modified generalized 12 barycentric coordinate interpolation of the map. The system we devised has facilities for contrast and feature enhancement, supports 13 both regular and irregular grids, can deal with multi-field as well as multispectral data, and can produce heat maps, choropleth maps, 14 and diagrams such as scatterplots. 15 Index Terms—Multivariate data, color mapping, color space, high dimensional data, pseudo coloring Ç 16 1INTRODUCTION 17 APPING data to color has a rich history and several A common practice is to visualize multivariate data as 40 18 Mwell-tested color schemes have emerged (e.g., [1], [6], multiple images where each channel is mapped to a sepa- 41 19 [35]). Most of these, however, are defined for scalar data rate plot with a simple color scale. Fig. 1d shows such an 42 20 where a scalar value indexes a one-dimensional table that arrangement for four scalar images. However, a disjoint dis- 43 21 returns an RGB color triple. Other schemes assign colors to play of this nature makes it difficult to recognize correla- 44 22 different, usually disjoint materials and then use standard tions (or a lack thereof) that may exist among the different 45 23 blending functions to handle areas where materials overlap channels (variables) in the image. 46 24 or mix together. The latter often occurs in the graphical ren- For this reason, we wish to fuse the individual images 47 25 dering of simulations or imaged data, while the former is into a single multi-color image. Correlations can then be 48 26 frequently encountered in pseudo-coloring for heat maps or easily perceived by similarity of color, while dissimilarities 49 27 choropleth maps. become apparent by color variations. At the same time we 50 28 In this paper, we are interested in colorizing multivariate can use the color as a label to reveal which of the factors 51 29 data. Here we mainly focus on numerical data (categorical dominate or co-exist in certain areas. Essentially, we retain 52 30 data can be converted into numerical data [34]). These types color as a visual representation of the relative strength of a 53 31 of multivariate data occur frequently in many applications, given variable for each pixel in the image. 54 32 such as demographic assessments, environmental monitor- One way to achieve this fusion is by interpolation or 55 33 ing, scientific simulations, imaging, business [10] and others. blending. Let us assume we have n 3 variables. Then 56 34 The domain can be a geographic map, an image, or a volume. each variable is assigned to one of n primary colors, and a 57 35 They are a subset of multi-field data which also include multi- mapped color is produced via bilinear (for n ¼ 2 variables) 58 36 channel, multi-attribute, multi-modal, and multi-material or barycentric (for n ¼ 3 variables) interpolation [36]. Alter- 59 37 data, among others. Visualizing these types of data in their natively, we can assign each variable to one of a monitor’s 60 38 native domain remains challenging, and there is so far little three (RGB) primaries and blend the three variables directly 61 39 support to map these data vectors directly into color. in hardware into an RGB image. 62 One drawback of this concept is that it is difficult to 63 extend to n>3. Hardware blending is infeasible since 64 The authors are with the Visual Analytics and Imaging Lab at the Computer monitors typically only have three primary colors. Con- 65 Science Department, Stony Brook University, Stony Brook, NY 11794 and versely, interpolation could be realized using advanced 66 with the Computational Science Initiative, Brookhaven National Lab, Upton, NY 11973-5000. E-mail: {shecheng, wxu, mueller}@cs.stonybrook.edu. schemes like generalized barycentric interpolation [25]. A 67 Manuscript received 23 Jan. 2017; revised 11 Feb. 2018; accepted 19 Feb. 2018. severe drawback of interpolation and blending is that they 68 Date of publication 0 . 0000; date of current version 0 . 0000. do not yield a perceptually uniform result. Both map the 69 (Corresponding author: Shenghui Cheng.) data into an RGB color cube which is not a perceptual color 70 Recommended for acceptance by I. Hotz. space. It gives rise to the rainbow color map which renders 71 For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference the Digital Object Identifier below. some value differentials invisible while overly emphasizing 72 Digital Object Identifier no. 10.1109/TVCG.2018.2808489 others [4], [31]. This is not the case for the established 1D 73 1077-2626 ß 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html_ for more information. 2 IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. 24, NO. X, XXXXX 2018 Fig. 1. System interface with all major displays and components (using the battery data, see Section 6.2 for more detail). Users can select a multivar- iate data point in any of these displays via mouse click. The system responds by highlighting the selected data point with a small circle both in the tar- geted display as well as in the other, synched displays (see arrows, added for illustration). (a) Integrated CIE HCL (Hue Chroma Luminance) interactive multivariate color mapping display (ICD, top) with control panel (middle), and the selected point’s multivariate spectrum display (bottom). (b) Multi-field / hyperspectral image, pseudo-colored via the multivariate color map in (a). (c) Locally enhanced colorization of the selected rectangu- lar region in (b). (d) Individual scalar images (usually displayed on the bottom of the interface in a channel view partition) colorized via the attribute- linked color primaries marked and labeled at the circle boundary of the multivariate color map in (a). The image in (b) constitutes a joint colorization of these individual channel images. 74 color maps which are the result of psycho-physical experi- known here is the IBM PRAVDA system [2]. In addition, a 113 75 ments and are perceptually uniform. prominent guide is also the Color Brewer [6] which presents a 114 76 The system we have devised combines a multivariate data variety of color schemes for cartography applications, broken 115 77 embedding scheme [7] inspired by generalized barycentric down into sequential, diverging, and qualitative schemes. For 116 78 interpolation with a perceptually uniform colorspace, CIE the former two schemes the site suggests decompositions into 117 79 HCL. The teaser image of Fig. 1 gives an overview of our up to 9 elements. More could be obtained via interpolation, 118 80 approach by ways of an example. Fig. 1d shows the four either piecewise linear to preserve the original elements or via 119 81 channel images we wish to fuse. Stacked up, each image higher-order functions. The Brewer schemes are highly 120 82 pixel is a 4D data point. We embed the data points into what respected and widely applied. According to the authors [16] 121 83 we call circular interactive multivariate color mapping display they were designed “using both experience and trial and 122 84 (ICD), shown in Fig. 1a. The attributes are arranged on the error”. Later, in more analytical research Wijffelaars et al. [35] 123 85 ICD’s boundary in terms of their similarity. Using the ICD, show that the Brewer palettes generally follow curved paths 124 86 the color of a multivariate data sample is then obtained via in the hue slices of the CIE LUV color space, but that the ele- 125 87 generalized barycentric coordinate interpolation. The gener- ments are not iso-distant from one another. The authors then 126 88 ated image (see Fig. 1b) clearly shows at what locations pix- describe an analytical tool by which lightness-ordered 127 89 els correlate and what the dominant factors are. palettes of any hue can be created and which follow optimally 128 90 Our paper is structured as follows. Section 2 presents lightness-sampled paths. 129 91 related work. Section 3 gives an overview of our tool and Choosing colors in CIE LUV color space is preferable 130 92 framework. Section 4 presents its basic features, while since it is perceptually uniform. Perceptual uniformity 131 93 Section 5 describes additional functionalities we developed means that any two equidistant colors elicit the same per- 132 94 in response to requirements we discovered during practical ceived color contrast in a human observer.