CSE 5544: Introduction to Data Visualization Raghu Machiraju
[xkcd]
The Visualization Alphabet: Marks and Channels
How can I visually represent two numbers, e.g., 4 and 8?
How can I visually represent two concepts, e.g., well-being and feeling rich?
…
Marks & Channels
Marks: represent items or links
Channels: change appearance based on attribute
Channel = Visual Variable
Visualization Families
Marks for Items
Basic geometric elements
0D: points; 1D: lines; 2D: areas
3D mark: volume, but rarely used
Marks for Links
Containment Connection Containment - can be nested
[Riche & Dwyer, 2010] Channels (aka Visual Variables)
Control appearance proportional to or based on attributes Jacques Bertin
French cartographer [1918-2010] Semiology of Graphics [1967] Theoretical principles for visual encodings Bertin’s Visual Variables Marks: Points Lines Areas
Position Size (Grey)Value
Texture Color Orientation Shape
Semiology of Graphics [J. Bertin, 67] Image
Visual language is a sign system
Images perceived as a set of signs
Visual Language - System of Signs Sender encodes information in signs
Receiver decodes information from signs
Jacques Bertin, Semiology of Graphics, 1983
Semiology of Graphics [J. Bertin, 83] Stolte / Hanrahan
“Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases”, Chris Stolte and Pat Hanrahan Channel Types: Where / What
Based on slide from Mazur Using Marks and Channels
Mark: Line; Channel: Length/Position; 1 quantitative attribute, 1 categorical attribute
Mark: Point; Channel: Position; 2 quantitative attr.
Adding Hue: +1 categorical attr.
Adding Size: +1 quantitative attr.
Redundant Encoding
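The chart recipes above (a mark, then one channel per attribute) can be written down as a small data structure. This is a minimal sketch and not part of the original slides: the class `Encoding`, the helper `describe`, and the attribute names (`price`, `weight`, `country`, `sales`) are hypothetical and only illustrate how adding a channel adds an attribute.

```python
from dataclasses import dataclass, field


@dataclass
class Encoding:
    """A mark type plus a mapping from channel name to (attribute, type)."""
    mark: str
    channels: dict = field(default_factory=dict)

    def add(self, channel, attribute, attr_type):
        # Return a new Encoding with one more channel bound to an attribute.
        updated = dict(self.channels)
        updated[channel] = (attribute, attr_type)
        return Encoding(self.mark, updated)


def describe(enc):
    """Count how many attributes of each type the encoding shows."""
    counts = {}
    for _attribute, attr_type in enc.channels.values():
        counts[attr_type] = counts.get(attr_type, 0) + 1
    return counts


# Scatterplot: point marks, two quantitative attributes on position.
scatter = (Encoding("point")
           .add("x_position", "price", "quantitative")
           .add("y_position", "weight", "quantitative"))

# Adding hue shows one more categorical attribute.
scatter_hue = scatter.add("hue", "country", "categorical")

# Adding size shows one more quantitative attribute.
scatter_hue_size = scatter_hue.add("size", "sales", "quantitative")

print(describe(scatter_hue_size))
```

Each call to `add` leaves the earlier encoding untouched, so the plain scatterplot and its extended versions can be compared side by side.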
Length, Position and Value Good bar chart?
Rule: Use channel proportional to data! Types of Channels
Magnitude Channels (How much?): Position, Length, Saturation, … (Ordinal & Quantitative Data)
Identity Channels (What? Where?): Shape, Color (hue), Spatial region, … (Categorical Data)
What Visual Variables?
http://www.nytimes.com/interactive/2013/05/25/sunday-review/corporate-taxes.html What Visual Variables ? Characteristics of Channels
Selective: Is a mark distinct from other marks? Can we make out the difference between two marks?
Associative: Does it support grouping?
Quantitative (Magnitude vs Identity Channels): Can we quantify the difference between two marks?
Order (Magnitude vs Identity): Can we see a change in order?
Length: How many unique marks can we make?
Expressiveness + Effectiveness
• Expressiveness:
 – visual encoding should express all of, and only, the information in the dataset attributes
 – simple one - lie factor
• Effectiveness:
 – importance of the attribute should match the salience of the channel
 – simple one - data-ink ratio
Effectiveness of Mappings - Complex
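The two "simple" measures named in the bullets above, lie factor and data-ink ratio, can be computed directly. The sketch below uses Tufte's usual definitions; the function names and the example numbers are hypothetical and chosen only for illustration.

```python
def relative_change(first, second):
    """Size of an effect: relative change from the first to the second value."""
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, shown_first, shown_second):
    """Lie factor: size of effect shown in the graphic / size of effect in the data."""
    return (relative_change(shown_first, shown_second)
            / relative_change(data_first, data_second))


def data_ink_ratio(data_ink, total_ink):
    """Share of the graphic's ink that is used to present data."""
    return data_ink / total_ink


# Hypothetical example: the data doubles (10 -> 20), but the bar drawn for it
# grows from 1 cm to 4 cm, so the graphic exaggerates the change.
print(lie_factor(10, 20, 1.0, 4.0))

# Hypothetical example: 30 units of ink carry data out of 120 units in total.
print(data_ink_ratio(30, 120))
```

A lie factor of 1 means the graphic shows the effect at its true size; values far from 1 signal that the encoding is not proportional to the data.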
[Kandel et al.: Principles of neural science]
Channels: Expressiveness Types and Effectiveness Ranks
Magnitude Channels: Ordered Attributes (listed from most to least effective)
 – Position on common scale
 – Position on unaligned scale
 – Length (1D size)
 – Tilt/angle
 – Area (2D size)
 – Depth (3D position)
 – Color luminance
 – Color saturation
 – Curvature
 – Volume (3D size)
Identity Channels: Categorical Attributes (listed from most to least effective)
 – Spatial region
 – Color hue
 – Motion
 – Shape
Spatial Location - Position
Spatial position
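The channel ranking listed above suggests a simple assignment rule: give the most important attribute the most effective channel of the matching type. The following is a minimal sketch of that idea, not a method stated in the slides. The function `assign_channels` and the attribute names are hypothetical, and the sketch assumes each channel is used at most once, which a real chart does not require (x and y position are both positions on a common scale).

```python
# Channel rankings as listed on the slide, most effective first.
MAGNITUDE_CHANNELS = [
    "position on common scale", "position on unaligned scale",
    "length (1D size)", "tilt/angle", "area (2D size)", "depth (3D position)",
    "color luminance", "color saturation", "curvature", "volume (3D size)",
]
IDENTITY_CHANNELS = ["spatial region", "color hue", "motion", "shape"]


def assign_channels(attributes):
    """Greedily map attributes to channels.

    `attributes` is a list of (name, kind) pairs sorted by decreasing
    importance, where kind is "ordered" or "categorical".
    """
    free = {
        "ordered": list(MAGNITUDE_CHANNELS),
        "categorical": list(IDENTITY_CHANNELS),
    }
    assignment = {}
    for name, kind in attributes:
        if not free[kind]:
            raise ValueError(f"no {kind} channel left for {name!r}")
        # Take the best remaining channel of the matching type.
        assignment[name] = free[kind].pop(0)
    return assignment


plan = assign_channels([
    ("gdp", "ordered"),
    ("continent", "categorical"),
    ("population", "ordered"),
])
for attribute, channel in plan.items():
    print(f"{attribute} -> {channel}")
```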
2.05D Position
Strongest visual variable
Suitable for all data types
Problems:
 – Sometimes not available (spatial data)
 – Cluttering
Selective: yes
Associative: yes
Quantitative: yes
Order: yes
Length: fairly big
Example: Scatterplot
Position in 3D?
[Spotfire] Length & Size
Good for 1D, OK for 2D, Bad for 3D
Easy to see whether one is bigger
Aligned bars use position redundantly
For 1D length:
Selective: yes
Associative: yes
Quantitative: yes
Order: yes
Length: high
Example 2D Size: Bubbles
Color
Good for qualitative data (identity channel)
Limited number of classes/length (~7-10!)
Does not work for quantitative data!
Lots of pitfalls! Be careful!
My rule: minimize color use for encoding data; use for brushing
Selective: yes
Associative: yes
Quantitative: no
Order: no
Length: limited
Human Visual System
The Eye & Retina
Retina Detectors
• 1 type of monochrome sensor (rods)
 – Important at low light
• Next level: lots of specialized cells
 – Detect edges, corners, etc.
• Sensitive to contrast
 – Weber’s law: ΔL ∝ L
Retina Detectors
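Weber's law from the contrast bullet above (ΔL ∝ L) says the smallest visible luminance step grows with the background level. A minimal sketch follows; the Weber fraction of 0.02 is an illustrative assumption, not a value given in the slides, and the function names are hypothetical.

```python
def just_noticeable_difference(background, weber_fraction=0.02):
    """Smallest detectable change under Weber's law: delta_L = k * L."""
    return weber_fraction * background


def is_noticeable(background, target, weber_fraction=0.02):
    """True if target differs from background by at least one JND."""
    jnd = just_noticeable_difference(background, weber_fraction)
    return abs(target - background) >= jnd


# The same absolute step of 1 unit is visible on a dark background
# but not on a bright one, because the threshold scales with L.
print(is_noticeable(10, 11))
print(is_noticeable(100, 101))
```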
3 types of color sensors - S, M, L (cones)
Works for bright light
Peak sensitivities located at approx. 430 nm (S), 540 nm (M), and 570 nm (L) for "average" observer.
Roughly equivalent to blue, green, and red sensors
Cone Response
HyperPhysics, Georgia State University Color Models RGB Color Space
- Additive system
- Colors that can be represented by computer monitors
- Not perceptually uniform
[Figure: RGB color cube; labeled corners: White, Cyan, Yellow, Green, Blue, Red, Black]
C. Ware, “Visual Thinking for Design” HSL Color Space
Hue - what people think of color
Saturation - purity, distance from grey
Lightness - from dark to light wikipedia.org
Not perceptually uniform
Thanks to Moritz Wustinger Lab Color Space
Perceptually uniform
L approximates human perception of lightness a, b approximate R/G and Y/B channels a, b called chroma
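The contrast between HSL ("not perceptually uniform") and Lab ("perceptually uniform") can be checked numerically. The sketch below converts sRGB to CIELAB with the standard sRGB matrix and a D65 white point, which is an assumption since the slides do not name a white point, and compares it with the HSL lightness from Python's `colorsys` module. Pure yellow and pure blue get the same HSL lightness but very different L* values.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def srgb_to_lab(r, g, b):
    """Convert sRGB in [0, 1] to CIELAB (L*, a*, b*), assuming a D65 white."""
    rl, gl, bl = (srgb_to_linear(c) for c in (r, g, b))
    # Linear RGB -> CIE XYZ (standard sRGB matrix, D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Cube root with a linear segment near zero, as in CIELAB 1976.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


yellow, blue = (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)
for name, rgb in (("yellow", yellow), ("blue", blue)):
    _hue, hsl_lightness, _sat = colorsys.rgb_to_hls(*rgb)
    lab_lightness = srgb_to_lab(*rgb)[0]
    print(f"{name}: HSL lightness {hsl_lightness:.2f}, "
          f"CIELAB L* {lab_lightness:.1f}")
```

This is one reason a lightness ramp built in HSL can look uneven, while a ramp built on L* is closer to evenly spaced for a viewer.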
CIELAB 1976 Perceptual Color Spaces
Lightness
Colorfulness
Hue
Unique black and white Courtesy of Maureen Stone Munsell Color
Hue, Value, Chroma Value 5 R 5/10 (bright red) N 8 (light gray)
Perceptually uniform Chroma Hue
Courtesy of Maureen Stone Munsell Atlas
Courtesy Gretag-Macbeth Interactive Munsell Tool
From www.munsell.com
Courtesy of Maureen Stone Another Model - Color Deficiency Color Opponency
Red - Green: Difference between R and G
Luminance (L): Combination of R and G
Yellow - Blue: Difference between L and B
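The three opponent channels above are simple sums and differences of cone signals. The sketch below follows the slide's notation, where R, G, and B stand for the long-, medium-, and short-wavelength cone signals. The unit weights and the function names are assumptions for illustration, and dropping one channel is only a crude stand-in for a color vision deficiency, not the Brettel et al. simulation.

```python
def opponent_channels(r, g, b):
    """Combine cone signals into the three opponent channels from the slide.

    r, g, b are the long-, medium-, and short-wavelength cone signals.
    Unit weights are an assumption made for this sketch.
    """
    luminance = r + g              # combination of R and G
    red_green = r - g              # difference between R and G
    yellow_blue = luminance - b    # difference between L and B
    return {"luminance": luminance, "red_green": red_green,
            "yellow_blue": yellow_blue}


def without_red_green(channels):
    """Drop the red-green channel, leaving a 2D color description."""
    return {k: v for k, v in channels.items() if k != "red_green"}


# Two stimuli that differ only in the balance between R and G.
stimulus_a = opponent_channels(0.6, 0.4, 0.3)
stimulus_b = opponent_channels(0.4, 0.6, 0.3)

print(stimulus_a["red_green"], stimulus_b["red_green"])
print(without_red_green(stimulus_a) == without_red_green(stimulus_b))
```

With all three channels the two stimuli are distinct; once the red-green channel is removed they collapse to the same description.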
C. Ware, “Visual Thinking for Design” Source: M. Stone Model “Color blindness”
Flaw in opponent processing
Red-green common (deuteranope, protanope)
Blue-yellow possible (tritanope), but rare
Luminance channel almost “normal”
8% of all men, 0.5% of all women
Effect is 2D color vision model
Flatten color space
Can be simulated (Brettel et al.)
http://colorfilter.wickline.org
http://www.colblindor.com/coblis-color-blindness-simulator/
Color Blindness
Simulate color vision deficiencies colorfilter.wickline.org www.vischeck.com
Protanope: no L cones (red / green deficiency)
Deuteranope: no M cones (red / green deficiency)
Tritanope: no S cones (blue / yellow deficiency)
Source: M. Stone
Color-Blindness
Normal Protanope Deuteranope Lightness
Source: M. Stone Luminance, Saturation, Hue
Luminance
 – How-much channel
 – Discriminability: ~2-4 bins
 – Contrast important
Saturation
 – How-much channel
 – Discriminability: ~3 bins
Hue
 – What channel
 – Discriminability: ~6-12
Value/Luminance/Saturation
OK for quantitative data when length & size are already used.
Not very many shades recognizable
Selective: yes
Associative: yes
Quantitative: somewhat (with problems)
Order: yes
Length: limited
Example: Diverging Value-Scale
Color: Bad Example
Cliff Mass
Color: Good Example
Shape
Great to recognize many classes.
No grouping, ordering.
Selective: yes Associative: limited Quantitative: no Order: no Length: vast
Chernoff Faces
Idea: use facial parameters to map quantitative data
Does it work? Not really!
Critique: https://eagereyes.org/criticism/chernoff-faces Glyphs
• Glyphs and icons
 – Consist of several components
• Features should be easy to distinguish and combine
• Icons should be separated from each other
• Mainly used for multivariate discrete data
Glyphs
• Color icons [Levkowitz 91]
• Subdivision of a basic figure (triangle, square, …) into edges and faces
• Mapping of data to faces via color tables
• Grouping by emphasizing edges or faces
Glyphs
Stick-figure icon [Pickett & Grinstein 88]
2D figure with 4 limbs
Coding of data via
 – Length
 – Thickness
 – Angle with vertical axis
12 attributes
Exploits the human capability to recognize patterns/textures
Using Stick Figure Icons
Glyphs
Circular icon plots:
 – Star plots
 – Sun ray plots
Follow a "spoked wheel" format
Values of variables are represented by distances between the center ("hub") of the icon and its edges
Glyphs
Star glyphs [S. E. Fienberg: Graphical methods in statistics. The American Statistician, 33:165-178, 1979]
 – A star is composed of equally spaced radii, stemming from the center
 – The length of the spike is proportional to the value of the respective attribute
 – The first spike/attribute is to the right
 – Subsequent spikes are counter-clockwise
 – The ends of the rays are connected by a line
Texture
Mapping to Texture
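The star glyph construction rules listed above translate directly into polar coordinates. This is a minimal sketch under stated assumptions: the function name `star_glyph_outline`, the normalisation by `max_value`, and the y-up coordinate system (so that increasing angles run counter-clockwise) are choices made here, not details from the slides.

```python
import math


def star_glyph_outline(values, max_value, radius=1.0):
    """Return the closed outline of a star glyph as a list of (x, y) points.

    Spikes are equally spaced; spike 0 points to the right and later
    spikes follow counter-clockwise (assuming the y axis points up).
    """
    n = len(values)
    vertices = []
    for i, value in enumerate(values):
        angle = 2 * math.pi * i / n            # equally spaced radii
        length = radius * value / max_value    # spike length proportional to value
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    # Connect the ends of the rays with a line by closing the polygon.
    return vertices + vertices[:1]


outline = star_glyph_outline([1.0, 0.5, 1.0, 0.5], max_value=1.0)
for x, y in outline:
    print(f"({x:+.2f}, {y:+.2f})")
```

The returned point list can be handed to any plotting routine that draws a polyline; drawing the spikes themselves only needs a segment from the origin to each vertex.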
Main parameters for texture
 – Orientation
 – Size
 – Contrast
Alternatively [Tamura 78]:
 – Coarseness
 – Roughness
 – Contrast
 – Directionality
 – Line-likeness
 – Regularity
[C. Ware, Information Visualization]
Mapping to Texture
Goal: Avoid visual "crosstalk"
 – "Orthogonal" perceptual channels
 – Restricts range of parameters
 – E.g. approximately 30 degrees difference in orientation needed to distinguish textures
Main application for textures: nominal data
Some applications for direct visualization of orientations
Mapping to Texture
Generate texture
Gabor functions as primitives
Parameters:
 – Orientation
 – Size
 – Contrast
Visualization of a magnetic field
Randomly splatter down Gabor functions
Blending yields continuous coverage
Stochastic texture model
[C. Ware, Information Visualization]
Mapping to Texture
Other stochastic texture models:
 – LIC (Line integral convolution) for vector field visualization
Structural models
 – Procedural description of texture generation
 – E.g. Lindenmayer systems (L-systems)
More Channels
Other Mappings
• More advanced mappings possible
• Examples for other visual variables
 – Motion
 – Blink coding
 – Explicit use of 3D
• Multiple attributes
• Typical combination of attributes:
 – Geometric position, e.g., height field
 – Color: saturation, intensity, tone
 – Texture
• Issue: Interference?
Other Mappings
Visual variable              Dimensionality
Spatial position of glyph    3 dimensions: X, Y, Z
Color of glyph               3 dimensions: defined by color opponent theory
Shape                        2–3? Dimensions unknown
Orientation                  3 dimensions: corresponding to orientation about each of the primary axes
Surface texture              3 dimensions: orientation, size, and contrast
Motion coding                2–3? Dimensions largely unknown, but phase may be useful
Blink coding                 1 dimension: the glyph blinks on and off at some rate
Theory
Why Quantitative Channels Different?
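Stevens’ Power Law, 1961: S = I^N
S = sensation, I = intensity, N = exponent that depends on the kind of stimulus
[Figure: sensation plotted against intensity; surviving curve label: "Electric"]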
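The power law relating sensation S to intensity I explains why the same physical ratio is read differently on different channels. The sketch below applies it to ratios; the exponents are approximate values often quoted from Stevens' experiments and are included here as illustrative assumptions, not as numbers taken from the slide.

```python
# Approximate exponents often quoted from Stevens' experiments; treat them
# as illustrative assumptions, not as values taken from the slide.
ASSUMED_EXPONENTS = {
    "length": 1.0,
    "area": 0.7,
    "brightness": 0.5,
    "electric shock": 3.5,
}


def perceived_ratio(physical_ratio, exponent):
    """Power law applied to a ratio of intensities: S2/S1 = (I2/I1)^N."""
    return physical_ratio ** exponent


for channel, n in ASSUMED_EXPONENTS.items():
    print(f"{channel}: a 2x physical change is perceived as about "
          f"{perceived_ratio(2.0, n):.2f}x")
```

Under these assumed exponents, length is read accurately (exponent 1), while area and brightness are underestimated, which is consistent with position and length ranking above area and luminance as magnitude channels.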
From Wilkinson 99, based on Stevens 61
How much longer? A vs. B: 2x
How much longer? A vs. B: 4x
How much steeper? A vs. B: ~4x
How much larger (area)? A vs. B: 5x
How much larger (area)? A vs. B: 3x
How much larger (diameter)? A vs. B: 2x
How much darker? A vs. B: 2x
How much darker? A vs. B: 3x
Position, Length & Angle
Relative vs. Absolute
Weber’s law says that perception is relative: the perceived “intensity” of a signal depends on the background signal.
Luminance perception is based on relative, not absolute, judgements. (a) The two squares A and B appear quite different. (b) Superimposing a gray mask on the image shows that they are in fact identical.
Color perception is also relative to surrounding colors and depends on context. (a) Both cubes have tiles that appear to be red. (b) Masking the intervening context shows that the colors are very different: with yellow apparent lighting, they are orange; with blue apparent lighting, they are purple.
Notice that the midpoint of the scales appears near the value of the background.
Other Factors Affecting Accuracy
Alignment
Distractors
Distance
Common scale
[Figure: bars A and B shown (a) unframed and unaligned, (b) framed and unaligned, (c) unframed and aligned]
Weber's Law states that we judge based on relative, not absolute differences. (a) The lengths of unframed, unaligned rectangles of slightly different sizes are hard to compare. (b) Adding a frame allows us to compare the very different … sizes of the unfilled rectangles between the bar and frame tops. (c) Aligning the bars also makes the judgement easy.
Cleveland / McGill, 1984
William S. Cleveland; Robert McGill , “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” 1984 Cleveland & McGill’s Results
Crowdsourced Results
[Figure: log error (axis 1.0 to 3.0) by encoding: positions, angles, circular areas, rectangular areas (aligned or in a treemap)]
Heer & Bostock, 2010
Jock Mackinlay, 1986
Decreasing
[Mackinlay, Automating the Design of Graphical Presentations of Relational Information, 1986]
Yet Another Factor
Effectiveness -- Discriminability
• How many colors can I tell apart?
• How many levels of grey, etc.?
• What about line width?
Separability of Attributes
Can we combine multiple visual variables?
T. Munzner, Visualization Analysis and Design, 2014 Popout - Pre-attentive Processing
parallel (visual processing)
Visual popout. (a) The red circle pops out from a small set of blue circles. (b) The red circle pops out from a large set of blue circles just as quickly. (c) The red circle also pops out from a small set of square shapes, although a bit slower than with color. (d) The red circle also pops out of a large set of red squares. (e) The red circle does not take long to find from a small set of mixed shapes and colors. (f) The red circle does not pop out from a large set of red squares and blue circles, and it can only be found by searching one by one through all the objects. Popout - Pre-attentive Processing
parallel (visual processing)
Grouping
- containment
- lines of connection
- proximity
- similarity
More integral coding pairs
Coding Pairs
• Integral display dimensions
 – Two or more attributes perceived holistically
• Separable dimensions
 – Separate judgments about each graphical dimension
• Simplistic classification, with a large number of exceptions and asymmetries
[C. Ware, Information Visualization]
More separable coding pairs