CSE 5544: Introduction to Data Raghu Machiraju [email protected]

[xkcd]

The Visualization Alphabet: Marks and Channels

How can I visually represent two numbers, e.g., 4 and 8?

How can I visually represent two concepts, e.g., well-being and feeling rich?

…

Marks & Channels

Marks: represent items or links

Channels: change appearance based on attribute

Channel = Visualization Families

Marks for Items

Basic geometric elements

0D (points), 1D (lines), 2D (areas)

3D mark: Volume, but rarely used

Marks for Links

Containment Connection Containment - can be nested

[Riche & Dwyer, 2010] Channels (aka Visual Variables)

Control appearance proportional to or based on attributes

Jacques Bertin, French cartographer [1918-2010]

Semiology of Graphics [1967]

Theoretical principles for visual encodings

Bertin’s Visual Variables

Marks: Points, Lines, Areas

Position Size (Grey)Value

Texture Orientation Shape

Semiology of Graphics [J. Bertin, 67] Image

Visual language is a sign system

Jacques Bertin, Semiology of Graphics, 1983

• Image perceived as a set of signs

• Sender encodes information in signs

• Receiver decodes information from signs

Semiology of Graphics [J. Bertin, 83] Stolte / Hanrahan

“Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases”, Chris Stolte and Channel Types: Where / What

Based on slide from Mazur Using Marks and Channels

Mark: Line; Channel: Length/Position; 1 quantitative attribute, 1 categorical attribute

Mark: Point; Channel: Position; 2 quantitative attr.

Adding a channel: +1 categorical attr.

Adding Size: +1 quantitative attr.

Redundant Encoding

Length, Position and Value Good bar ?

Rule: Use channel proportional to data! Types of Channels

Magnitude Channels (How much?): Position, Length, Saturation … (Ordinal & Quantitative Data)

Identity Channels (What? Where?): Shape, Color (hue), Spatial region … (Categorical Data)

What Visual Variables?

http://www.nytimes.com/interactive/2013/05/25/sunday-review/corporate-taxes.html What Visual Variables ? Characteristics of Channels

Selective: Is a mark distinct from other marks? Can we make out the difference between two marks?

Associative: Does it support grouping?

Quantitative (Magnitude vs Identity Channels): Can we quantify the difference between two marks?

Characteristics of Channels

Order (Magnitude vs Identity): Can we see a change in order?

Length: How many unique marks can we make?

Expressiveness + Effectiveness

• Expressiveness:
– visual encoding should express all of, and only, the information in the dataset attributes
– simple one - lie factor
• Effectiveness:
– importance of the attribute should match the salience of the channel
– simple one - data-ink ratio

Effectiveness of Mappings - Complex
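The lie factor named under expressiveness can be computed directly. A minimal sketch, assuming Tufte's usual definition (size of the effect shown in the graphic divided by the size of the effect in the data); the function names and the bubble example are hypothetical and reuse the numbers 4 and 8 from the opening question:

```python
import math


def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second."""
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, drawn_first, drawn_second) -> float:
    """Effect shown in the graphic divided by the effect in the data.

    A value near 1 means the graphic is proportional to the data.
    """
    return effect_size(drawn_first, drawn_second) / effect_size(data_first, data_second)


# Hypothetical chart: the data values 4 and 8 are drawn as circles whose
# radius (not area) is proportional to the value, so the area grows 4x.
area_small = math.pi * 1.0 ** 2
area_large = math.pi * 2.0 ** 2

honest = lie_factor(4, 8, 4, 8)                         # bar lengths 4 and 8
exaggerated = lie_factor(4, 8, area_small, area_large)  # bubble areas
print(round(honest, 2), round(exaggerated, 2))
```

A lie factor well above 1, as in the bubble case, signals that the channel is not being used proportionally to the data.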

[Kandel et al.: Principles of neural science]

Channels: Expressiveness Types and Effectiveness Ranks

Magnitude Channels: Ordered Attributes (listed from most to least effective)
– Position on common scale
– Position on unaligned scale
– Length (1D size)
– Tilt/angle
– Area (2D size)
– Depth (3D position)
– Color luminance
– Color saturation
– Curvature
– Volume (3D size)

Identity Channels: Categorical Attributes (listed from most to least effective)
– Spatial region
– Color hue
– Motion
– Shape

Spatial Location - Position

Spatial position

2.05D Position

Strongest visual variable

Suitable for all data types

Problems:
– Sometimes not available (spatial data)
– Cluttering

Selective: yes
Associative: yes
Quantitative: yes
Order: yes
Length: fairly big

Example: Scatterplot

Position in 3D?

[Spotfire] Length & Size

Good for 1D, OK for 2D, Bad for 3D

Easy to see whether one is bigger

Aligned bars use position redundantly

For 1D length:
Selective: yes
Associative: yes
Quantitative: yes
Order: yes
Length: high

Example 2D Size: Bubbles

Color

Good for qualitative data (identity channel)

Limited number of classes/length (~7-10!)

Does not work for quantitative data!

Lots of pitfalls! Be careful!

My rule: minimize color use for encoding data; use for brushing

Selective: yes
Associative: yes
Quantitative: no
Order: no
Length: limited

Human Visual System

The Eye & Retina

Retina Detectors

• 1 type of monochrome sensor (rods)
– Important at low light
• Next level: lots of specialized cells
– Detect edges, corners, etc.
• Sensitive to contrast
– Weber’s law: ΔL ~ L

Retina Detectors

3 types of color sensors - S, M, L (cones)

Works for bright light

Peak sensitivities located at approx. 420-440nm (S), 530-545nm (M), and 560-580nm (L) for "average" observer.

Roughly equivalent to blue, green, and red sensors

Cone Response

HyperPhysics, Georgia State University Color Models RGB

- Additive system
- Colors that can be represented by computer monitors
- Not perceptually uniform

[Figure: RGB color cube with corners labeled White, Cyan, Yellow, Green, Blue, Red, Black]

C. Ware, “Visual Thinking for Design” HSL Color Space

Hue - what people think of color

Saturation - purity, distance from grey

Lightness - from dark to light wikipedia.org

Not perceptually uniform
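A small sketch of why HSL is not perceptually uniform: two fully saturated colors with the same HSL lightness can differ widely in luminance. The helper names are hypothetical; the luminance weights and transfer function are the standard sRGB ones, used here as an assumption about the display.

```python
import colorsys


def srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer function for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r: float, g: float, b: float) -> float:
    """Relative luminance of an sRGB color (0 = black, 1 = white)."""
    rl, gl, bl = (srgb_to_linear(c) for c in (r, g, b))
    return 0.2126 * rl + 0.7152 * gl + 0.0722 * bl


def hsl_luminance(hue_degrees: float, saturation: float = 1.0,
                  lightness: float = 0.5) -> float:
    """Luminance of an HSL color; note colorsys uses the order (h, l, s)."""
    r, g, b = colorsys.hls_to_rgb(hue_degrees / 360.0, lightness, saturation)
    return relative_luminance(r, g, b)


# Same HSL lightness (0.5) and saturation (1.0), different hue.
yellow = hsl_luminance(60)
blue = hsl_luminance(240)
print(f"yellow: {yellow:.3f}  blue: {blue:.3f}")
```

Yellow comes out far brighter than blue even though both have L = 0.5 in HSL, which is the kind of mismatch a perceptually uniform space such as CIELAB is meant to avoid.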

Thanks to Moritz Wustinger Lab Color Space

Perceptually uniform

L approximates human perception of lightness

a, b approximate R/G and Y/B channels

a, b called chroma

CIELAB 1976 Perceptual Color Spaces

Lightness

Colorfulness

Hue

Unique black and white Courtesy of Maureen Stone Munsell Color

Hue, Value, Chroma

Examples: 5 R 5/10 (bright red), N 8 (light gray)

Perceptually uniform

[Figure: axes labeled Value, Chroma, Hue]

Courtesy of Maureen Stone Munsell Atlas

Courtesy Gretag-Macbeth Interactive Munsell Tool

From www.munsell.com

Courtesy of Maureen Stone Another Model - Color Deficiency Color Opponency

Red - Green: Difference between R and G

Luminance (L): Combination of R and G

Yellow - Blue: Difference between L and B
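The three opponent channels above can be written as a toy computation. This is only a sketch under strong assumptions: the inputs stand in for idealized R, G, B cone-like signals in [0, 1], and the unweighted sums and differences are placeholders, not a calibrated vision model.

```python
def opponent_channels(r: float, g: float, b: float):
    """Toy opponent-process encoding of idealized R, G, B signals.

    Returns (red_green, luminance, yellow_blue) following the slide:
    red-green is the difference between R and G, luminance combines
    R and G, and yellow-blue is the difference between luminance and B.
    """
    red_green = r - g
    luminance = r + g
    yellow_blue = luminance - b
    return red_green, luminance, yellow_blue


pure_red = opponent_channels(1.0, 0.0, 0.0)
pure_green = opponent_channels(0.0, 1.0, 0.0)
print(pure_red, pure_green)
```

In this toy model pure red and pure green differ only in the red-green channel, which hints at why a flaw in that channel makes the two hard to tell apart.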

C. Ware, “Visual Thinking for Design” Source: M. Stone Model “Color blindness”

Flaw in opponent processing

Red-green common (deuteranope, protanope -- most common)

Blue-yellow possible (tritanope)

Luminance channel almost “normal”

8% of all men, 0.5% of all women

Effect is 2D color vision model

Flatten color space

Can be simulated (Brettel et al.)

http://colorfilter.wickline.org

http://www.colblindor.com/coblis-color-blindness-simulator/

Color Blindness

Simulate color vision deficiencies colorfilter.wickline.org www.vischeck.com

Protanope: No L cones (Red / green deficiency)
Deuteranope: No M cones (Red / green deficiency)
Tritanope: No S cones (Blue / Yellow deficiency)

Source: M. Stone

Color-Blindness

Normal Protanope Deuteranope Lightness

Source: M. Stone Luminance, Saturation, Hue

Luminance: How-much channel; discriminability: ~2-4 bins; contrast important

Saturation: How-much channel; discriminability: ~3 bins

Hue: What channel; discriminability: ~6-12

Value/Luminance/Saturation

OK for quantitative data when length & size are already used.

Not very many shades recognizable

Selective: yes
Associative: yes
Quantitative: somewhat (with problems)
Order: yes
Length: limited

Example: Diverging Value-Scale

Color: Bad Example

Cliff Mass

Color: Good Example

Shape

Great to recognize many classes. No grouping, ordering.

Selective: yes
Associative: limited
Quantitative: no
Order: no
Length: vast

Chernoff Faces

Idea: use facial parameters to encode quantitative data

Does it work? Not really!

Critique: https://eagereyes.org/criticism/chernoff-faces Glyphs

• Glyphs and icons
– Consist of several components
• Features should be easy to distinguish and combine
• Icons should be separated from each other
• Mainly used for multivariate discrete data

Glyphs

• Color icons [Levkowitz 91]
• Subdivision of a basic figure (triangle, square, …) into edges and faces
• Mapping of data to faces via color tables
• Grouping by emphasizing edges or faces

Glyphs

Stick-figure icon [Pickett & Grinstein 88]
– 2D figure with 4 limbs
– Coding of data via length, thickness, angle with vertical axis
– 12 attributes
– Exploits the human capability to recognize patterns/textures

Using Stick Figure Icons

Glyphs

Circular icon plots:
– Star plots
– Sun ray plots

Follow a "spoked wheel" format

Values of variables are represented by distances between the center ("hub") of the icon and its edges

Glyphs

Star glyphs [S. E. Fienberg: Graphical methods in statistics. The American Statistician, 33:165-178, 1979]
– A star is composed of equally spaced radii, stemming from the center
– The length of the spike is proportional to the value of the respective attribute
– The first spike/attribute is to the right
– Subsequent spikes are counter-clockwise
– The ends of the rays are connected by a line

Texture
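The star glyph construction rules can be sketched in a few lines. The function name, the per-attribute maxima used for normalization, and the sample record are assumptions for illustration; coordinates use the mathematical convention (y up), so increasing angle runs counter-clockwise.

```python
import math


def star_glyph_vertices(values, max_values, radius=1.0):
    """Return the spike end points of a star glyph.

    Spikes are equally spaced; the first points to the right (angle 0)
    and later ones follow counter-clockwise. Each spike length is
    proportional to its attribute value (value / max * radius).
    """
    n = len(values)
    vertices = []
    for i, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2.0 * math.pi * i / n
        length = radius * value / vmax
        vertices.append((length * math.cos(angle), length * math.sin(angle)))
    return vertices


# Hypothetical record with four attributes, each with maximum 4.
vertices = star_glyph_vertices([2, 4, 1, 3], [4, 4, 4, 4])
# Connect the ray ends by a closed line: repeat the first vertex at the end.
outline = vertices + vertices[:1]
for x, y in outline:
    print(f"({x:+.2f}, {y:+.2f})")
```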

Main parameters for texture:
– Orientation
– Size
– Contrast

Alternatively [Tamura 78]:
– Coarseness
– Roughness
– Contrast
– Directionality
– Line-likeness
– Regularity

[C. Ware, Information Visualization]

Mapping to Texture

Goal: Avoid visual “crosstalk”

“Orthogonal” perceptual channels

Restricts range of parameters

E.g. approximately 30 degrees difference in orientation needed to distinguish textures

Main application for textures: nominal data

Some applications for direct visualization of orientations

Mapping to Texture

Generate texture

Gabor func. as primitives

Parameters:
– Orientation
– Size
– Contrast

[C. Ware, Information Visualization]

Visualization of a magnetic field

Randomly splatter down Gabor functions

Blending yields continuous coverage

Stochastic texture model

Mapping to Texture
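The Gabor-based texture generation described above (randomly splattered primitives, blended by summation) might look roughly like this. It is a sketch under stated assumptions: the carrier wavelength is tied to the size parameter, blending is a plain sum, and all names and the sample parameters are hypothetical rather than taken from Ware.

```python
import math
import random


def gabor(x, y, orientation, size, contrast):
    """One Gabor primitive centered at the origin.

    A Gaussian envelope (width set by `size`) multiplied by a cosine
    carrier oriented along `orientation` (radians), scaled by `contrast`.
    """
    along = x * math.cos(orientation) + y * math.sin(orientation)
    envelope = math.exp(-(x * x + y * y) / (2.0 * size * size))
    carrier = math.cos(2.0 * math.pi * along / size)  # assumed wavelength = size
    return contrast * envelope * carrier


def splat_texture(width, height, n_splats, orientation, size, contrast, seed=0):
    """Sum randomly placed Gabor primitives over a width x height grid."""
    rng = random.Random(seed)
    centers = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n_splats)]
    return [[sum(gabor(x - cx, y - cy, orientation, size, contrast)
                 for cx, cy in centers)
             for x in range(width)]
            for y in range(height)]


texture = splat_texture(width=16, height=8, n_splats=20,
                        orientation=math.radians(30), size=3.0, contrast=0.8)
print(len(texture), len(texture[0]))
```

For a field visualization, orientation, size, and contrast would vary per splat according to the local data values instead of being fixed for the whole texture.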

Other stochastic texture models:
– LIC (Line integral convolution) for vector field visualization

Structural models
– Procedural description of texture generation
– E.g. Lindenmayer systems (L-systems)

More Channels

Other Mappings

• More advanced mappings possible
• Examples for other visual variables
– Motion
– Blink coding
– Explicit use of 3D
• Multiple attributes
• Typical combination of attributes:
– Geometric position, e.g., height field
– Color: saturation, intensity, tone
– Texture
• Issue: Interference?

Other Mappings

Visual variable: Dimensionality
– Spatial position of glyph: 3 dimensions: X, Y, Z
– Color of glyph: 3 dimensions: defined by color opponent theory
– Shape: 2–3? dimensions unknown
– Orientation: 3 dimensions: corresponding to orientation about each of the primary axes
– Surface texture: 3 dimensions: orientation, size, and contrast
– Motion coding: 2–3? dimensions largely unknown, but phase may be useful
– Blink coding: 1 dimension: the glyph blinks on and off at some rate

Theory

Why Quantitative Channels Different?

S = sensation, I = intensity

Stevens’ Power Law, 1961

[Figure: sensation vs. intensity curves, one labeled Electric]
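Stevens' power law models sensation as a power of intensity, S = I^n, with an exponent n that depends on the channel. The sketch below shows how the exponent turns a physical ratio into a perceived ratio. The exponent values are commonly quoted approximations and are an assumption here, not numbers taken from the slide.

```python
# Commonly quoted approximate exponents (assumed, not from the slide).
EXPONENTS = {
    "length": 1.0,          # perceived roughly linearly
    "area": 0.7,            # compressed: large areas are underestimated
    "brightness": 0.5,      # strongly compressed
    "electric shock": 3.5,  # expanded: small increases feel much larger
}


def perceived_ratio(physical_ratio: float, exponent: float) -> float:
    """Perceived ratio between two stimuli under S = I ** n."""
    return physical_ratio ** exponent


for channel, n in EXPONENTS.items():
    print(f"{channel:>14}: 4x physical -> {perceived_ratio(4.0, n):.2f}x perceived")
```

Under these assumptions a mark with four times the area reads as well under four times as large, while a fourfold length is judged about right, which is one way to see why length ranks above area.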

From Wilkinson 99, based on Stevens 61

How much longer?

A 2x B How much longer?

A 4x B How much steeper?

~4x

A B How much larger (area)?

5x

A B How much larger (area)?

3x

A B How much larger (diameter)?

2x

A B How much darker?

2x

A B How much darker?

3x

A B

Position, Length & Angle

Relative vs. Absolute

Weber’s law says that everything is relative, i.e. the “intensity” depends on the background signal

Luminance perception is based on relative, not absolute, judgements. (a) The two squares A and B appear quite different. (b) Superimposing a gray mask on the image shows that they are in fact identical.

Color perception is also relative to surrounding colors and depends on context. (a) Both cubes have tiles that appear to be red. (b) Masking the intervening context shows that the colors are very different: with yellow apparent lighting, they are orange; with blue apparent lighting, they are purple.

Notice that midpoint of scales appears near value of background

Other Factors Affecting Accuracy

Alignment

Distractors

Distance

Common scale

[Figure: bars A and B shown in three panels: unframed and unaligned; framed and unaligned; unframed and aligned]

Weber's Law states that we judge based on relative, not absolute differences. (a) The lengths of unframed, unaligned rectangles of slightly different sizes are hard to compare. (b) Adding a frame allows us to compare the very different … sizes of the unfilled rectangles between the bar and frame tops. (c) Aligning the bars also makes the judgement easy.
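The framing effect in the caption follows from Weber's law, ΔL ~ L: a difference is noticeable when it is large relative to the magnitudes being compared. A minimal sketch, where the 2% Weber fraction and the bar and frame sizes are hypothetical values chosen for illustration:

```python
def relative_difference(a: float, b: float) -> float:
    """Difference between two magnitudes relative to the smaller one."""
    return abs(a - b) / min(a, b)


def is_noticeable(a: float, b: float, weber_fraction: float = 0.02) -> bool:
    """True if the relative difference reaches the (assumed) Weber fraction."""
    return relative_difference(a, b) >= weber_fraction


# Two unframed bars of slightly different length: a 1% difference.
bar_a, bar_b = 100.0, 101.0
# Put both inside frames of height 110 and compare the unfilled gaps instead.
frame = 110.0
gap_a, gap_b = frame - bar_a, frame - bar_b  # 10 vs 9

print(is_noticeable(bar_a, bar_b), is_noticeable(gap_a, gap_b))
```

The absolute difference is one unit in both comparisons, but relative to the short gaps it is about 11%, so the framed version is much easier to judge.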

Cleveland / McGill, 1984

William S. Cleveland; Robert McGill, “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” 1984

Cleveland & McGill’s Results
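The "Log Error" axis in these results refers to the accuracy measure Cleveland and McGill used for proportion judgements: log2 of the absolute error in percentage points plus 1/8. A small sketch; the function name and the sample judgements are hypothetical.

```python
import math


def log_error(judged_percent: float, true_percent: float) -> float:
    """Cleveland-McGill error: log2(|judged - true| + 1/8).

    The 1/8 offset keeps the measure finite for a perfect answer,
    which then scores log2(1/8) = -3.
    """
    return math.log2(abs(judged_percent - true_percent) + 1.0 / 8.0)


# Hypothetical judgements of "what percent is the smaller value of the larger?"
trials = [(50.0, 50.0), (45.0, 50.0), (70.0, 50.0)]
for judged, true in trials:
    print(judged, true, round(log_error(judged, true), 2))
```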

[Figure: log error (axis from 1.0 to 3.0) by encoding type: positions, angles, circular areas, rectangular areas (aligned or in a treemap); Cleveland & McGill's results alongside crowdsourced results]

Heer & Bostock, 2010

Jock Mackinlay, 1986

Decreasing

[Mackinlay, Automating the Design of Graphical Presentations of Relational Information, 1986]

Yet Another Factor

Effectiveness -- Discriminability
• how many colors can I tell apart?
• how many levels of grey etc.
• What about line width?

Separability of Attributes

Can we combine multiple visual variables?

T. Munzner, Visualization Analysis and Design, 2014 Popout - Pre-attentive Processing

parallel (visual processing)

Visual popout. (a) The red circle pops out from a small set of blue circles. (b) The red circle pops out from a large set of blue circles just as quickly. (c) The red circle also pops out from a small set of square shapes, although a bit slower than with color. (d) The red circle also pops out of a large set of red squares. (e) The red circle does not take long to find from a small set of mixed shapes and colors. (f) The red circle does not pop out from a large set of red squares and blue circles, and it can only be found by searching one by one through all the objects. Popout - Pre-attentive Processing

parallel (visual processing)

Grouping
- containment
- lines of connection
- proximity
- similarity

More integral coding pairs

Coding Pairs

• Integral display dimensions
– Two or more attributes perceived holistically
• Separable dimensions
– Separate judgments about each graphical dimension
• Simplistic classification, with a large number of exceptions and asymmetries

[C. Ware, Information Visualization]

More separable coding pairs