
Data Visualization (DSC 530/CIS 602-02)

Maps

Dr. David Koop

D. Koop, DSC 530, Spring 2017

Tree Visualizations

Quantifying the Space-Efficiency of 2D Graphical Representations of Trees

Michael J. McGuffin and Jean-Marc Robert

Abstract: A mathematical evaluation and comparison of the space-efficiency of various 2D graphical representations of tree structures is presented. As part of the evaluation, a novel metric called the mean area exponent is introduced that quantifies the distribution of area across nodes in a tree representation, and that can be applied to a broad range of different representations of trees. Several representations are analyzed and compared by calculating their mean area exponent as well as the area they allocate to nodes and labels. Our analysis inspires a set of design guidelines as well as a few novel tree representations that are also presented.

Index Terms: Tree visualization, graph drawing, efficiency metrics.

1 INTRODUCTION

A variety of graphical representations are available for depicting tree structures (Figure 1), from “classical” node-link diagrams [23, 7], to treemaps [14, 26, 6, 30], concentric circles [2, 27, 31], and many others (see [13] for a survey). A major consideration when designing, evaluating, or comparing such representations is how efficiently they use screen space to show information about the tree. To date, however, it is unclear how to go about evaluating space-efficiency in a way that can be applied to the large variety of tree representations and that enables a fair comparison of them. Space-efficiency might be described in terms of area, aspect ratio, label size, or other measures. However, there is no accepted standard set of metrics for evaluating the space-efficiency of tree representations, and it is unclear what approach would be general enough to be applied to all the forms in Figure 1.

One basic metric of space-efficiency is the total area of a representation. Assuming the representation is bound within a 1 × 1 square, both icicle diagrams and treemaps (Figures 1C and 1G) have a total area of 1, and are equally efficient (and both optimal) according to this metric. Likewise, concentric circles and nested circles (Figures 1E and 1F) both have a total area of π/4 ≈ 0.785 (the area of a circle of diameter 1), and are also equally efficient according to the metric of total area. However, experience suggests that the representations within each of these pairs do not scale equally well with larger, deeper trees. This article shows that there are finer ways of distinguishing efficiency, i.e. that there is more to space-efficiency than total area.

Treemaps are often described as optimally space-efficient, not just because they have a total area of 1, but also because they allow for what we call a weighted partitioning of the area. Nodes can be allocated more or less area, depending on some attribute such as file size, population, or number of species, and furthermore this weighted partitioning can be done without reducing the total area used. These are indeed desirable properties, however they are not unique to treemaps. Figure 2 shows that icicle diagrams also allow for a weighted partitioning of area, and incidentally have no need for margins between the borders of nodes as treemaps often do. Furthermore, although a weighted partitioning is useful for showing the relative sizes of nodes in Figures 2A and 2C, an unfortunate side effect is that labels on small nodes are very difficult to read. If users are more interested in seeing the identity of all nodes rather than their relative sizes, an alternative approach would be to give equal weight to each leaf node (Figures 2B and 2D), improving the overall legibility of nodes. (Although not shown in the figure, the labels could also be augmented to numerically show the “size” attribute of each node.) In terms of label size or legibility, Figures 2B and 2D are clearly preferable, but even they still result in much whitespace around certain labels, suggesting that a more space-efficient (in terms of label size) representation might be possible.

Clearly, it would be useful to have some way to quantitatively distinguish the four possibilities in Figure 2, e.g. in terms of their respective scalability and the sizes of their labels. If total area is the only metric of space-efficiency used, and “optimal” space-efficiency is defined as a total area of 1 (possibly partitioned by weight), then we have no way of distinguishing these four cases. If alternative metrics of space-efficiency are used, such as those investigated in this article, it is not clear initially if treemaps, or any other representation, will still turn out to be optimal with respect to such alternative metrics.

This article identifies several metrics related to space-efficiency, and performs the first rigorous analysis and comparison of the space-efficiency of most of the basic tree representation styles in the information visualization literature, including all those in Figure 1. Some of the key ideas involved are (1) the use of a metric of the size of the smallest nodes (i.e. the leaf nodes) in the representation, in addition to a metric of total area; (2) analyzing the area of labels on the nodes, which implicitly takes into account both the size and aspect ratio of the nodes, measuring how much “useful” area they contain; and (3) analyzing how these metrics behave asymptotically, as the tree grows

Fig. 1. Several basic kinds of tree representations, here each showing a complete 3-ary tree of depth 3 as an example. All representations are drawn to just fit within a 1 × 1 unit square. A: classical (layered) node-link [23, 7]. B: a variation on A, where the shape of nodes better accommodates long labels. C: icicle. D: radial [10, 9]. E: concentric circles [2, 27, 31]. F: nested circles, similar to [5, 28]. G: treemap [14, 26]. H: indented outline, sometimes called a “tree list”, and common in file browsers such as Microsoft Explorer.

• Michael J. McGuffin is with École de technologie supérieure, Montréal, Canada, E-mail: michael.mcguffi[email protected].
• Jean-Marc Robert is with École de technologie supérieure, Montréal, Canada, E-mail: [email protected].

[McGuffin and Robert, 2010]

Treemap

[A. Cox and H. Fairfield, NYTimes, 2012]

Treemap Layout Algorithms
• Slice and Dice
  - Alternate direction as splits occur
• Strip
  - Order rectangles and move to a new row when aspect ratio gets worse
• Squarified
  - Split in both directions to get best aspect ratio
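The slice-and-dice idea (alternate the split direction at each level) can be sketched in a few lines of plain JavaScript. This is only an illustrative sketch, not the layout code behind the examples in these slides: the `sumValue` and `sliceAndDice` helpers and the example tree are hypothetical, and each leaf is assumed to carry a numeric `value`.

```javascript
// Slice-and-dice treemap layout: split the parent rectangle among its
// children in proportion to their values, alternating the split direction
// at each level of the tree.
function sumValue(node) {
  // A leaf carries its own value; an internal node sums its children.
  if (!node.children || node.children.length === 0) return node.value;
  return node.children.reduce((total, child) => total + sumValue(child), 0);
}

function sliceAndDice(node, x, y, width, height, depth = 0, out = []) {
  out.push({ name: node.name, x, y, width, height });
  if (!node.children || node.children.length === 0) return out;

  const total = sumValue(node);
  const horizontal = depth % 2 === 0; // even depth: split along x, odd: along y
  let offset = 0; // fraction of the parent already handed out
  for (const child of node.children) {
    const share = sumValue(child) / total;
    if (horizontal) {
      sliceAndDice(child, x + offset * width, y, share * width, height, depth + 1, out);
    } else {
      sliceAndDice(child, x, y + offset * height, width, share * height, depth + 1, out);
    }
    offset += share;
  }
  return out;
}

// Hypothetical example tree (names and values are made up).
const tree = {
  name: "root",
  children: [
    { name: "a", value: 4 },
    { name: "b", children: [{ name: "b1", value: 1 }, { name: "b2", value: 3 }] },
  ],
};
const rects = sliceAndDice(tree, 0, 0, 100, 100);
```

Because each level only ever splits in one direction, deep or unbalanced trees tend to produce long thin rectangles, which is the aspect-ratio problem the strip and squarified layouts try to reduce.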

Treemap Slice and Dice Layout

[Figure: treemap of the "Top Albums" example data (artists and their albums), slice-and-dice layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Treemap Strip Layout

[Figure: treemap of the "Top Albums" example data (artists and their albums), strip layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Treemap Squarify Layout

[Figure: treemap of the "Top Albums" example data (artists and their albums), squarified layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Squarified Treemaps and Cushion Treemaps

Fig. 5. Squarified treemaps: (a) File system, (b) Organization

Fig. 6. Squarified cushion treemaps: (a) File system, (b) Organization

[Paper excerpt on nested borders; the equations did not survive extraction: This method has some disadvantages. Extra screen-space is used, and furthermore, it gives rise to maze-like images, which can be puzzling for the viewer. However, the second disadvantage can be remedied in a similar way as for the visualization of the nodes. We fill in the borders with grey-shades, based on a simple geometric model (figure 8). The width in pixels of a border depends on its level, on the width of the root level border, and on a factor that can be used to decrease the width for lower level borders. For the profile of the border we use a parabola.]

[Bruls et al., 1999]

Project Proposal
• Due Tuesday, March 21
• Identify dataset
  - Potential data sources: https://github.com/caesar0301/awesome-public-datasets
• Understand domain
• Decide on tasks
• Start brainstorming on visualization and interaction design

Geographic Data
• Spatial data
• Cartography: the science of drawing maps
  - Lots of history and well-established procedures
  - May also have non-spatial attributes associated with items
  - Thematic cartography: integrate these non-spatial attributes (e.g. population, life expectancy, etc.)
• Goals:
  - Respect cartographic principles
  - Understand geographically referenced data using visualization principles

Locate

Adding Data
• Discrete: a value is associated with a specific position
  - Size
  - Color Hue
  - Charts
• Continuous: each spatial position has a value (fields)
  - Heatmap
  - Isolines

Discrete Categorical Attribute: Shape

[Acadia NP, National Park Service]

Discrete Quantitative Attribute: Color Saturation

Time as the attribute

[http://www.nytimes.com/interactive/2011/03/11/world/asia/maps-of-earthquake-and-tsunami-damage-in-japan.html]

Isolines

[USGS via Wikipedia]

Isolines
• Scalar fields:
  - value at each location
  - sampled on grids
• Isolines use derived data from the scalar field
  - Interpret field as representing continuous values
  - Derived data is geometry: new lines that represent the same attribute value
• Scalability: dozens of levels
• Other encodings?
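As a rough illustration of "derived data is geometry", the sketch below finds the points where an isoline of a given value crosses the edges of a sampled grid, assuming the field varies linearly between neighboring samples. The `interpolate` and `isolineCrossings` helpers and the elevation samples are hypothetical; a full method such as marching squares would additionally connect these points into lines.

```javascript
// Find where an isoline of a given value crosses the edges of a sampled grid.
// grid[row][col] holds the scalar value sampled at (x = col, y = row).
function interpolate(v0, v1, iso) {
  // Fraction of the way from the v0 sample to the v1 sample at which the
  // field equals iso, assuming linear variation between the two samples.
  return (iso - v0) / (v1 - v0);
}

function isolineCrossings(grid, iso) {
  const points = [];
  for (let row = 0; row < grid.length; row++) {
    for (let col = 0; col < grid[row].length; col++) {
      const v = grid[row][col];
      // Edge to the right-hand neighbor.
      if (col + 1 < grid[row].length) {
        const right = grid[row][col + 1];
        if ((v < iso) !== (right < iso)) {
          points.push({ x: col + interpolate(v, right, iso), y: row });
        }
      }
      // Edge to the neighbor below.
      if (row + 1 < grid.length) {
        const below = grid[row + 1][col];
        if ((v < iso) !== (below < iso)) {
          points.push({ x: col, y: row + interpolate(v, below, iso) });
        }
      }
    }
  }
  return points;
}

// Hypothetical 3x3 grid of elevation samples that rises from left to right.
const elevation = [
  [0, 10, 20],
  [0, 10, 20],
  [0, 10, 20],
];
const crossings = isolineCrossings(elevation, 5);
```

For this grid the value 5 lies halfway between the first two columns, so the crossings form a vertical line at x = 0.5.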

Map Projection

[P. Foresman, Wikimedia]

Flattening the Sphere?

[USGS Map Projections]

Lambert Conformal Conic Projection

[USGS Map Projections]

Map Projections

[http://xkcd.com/977/]

Choropleth (Two Hues)

[M. Ericson, New York Times]

Area Marks and Color Hue & Saturation

Choropleth Map
• Data: geographic geometry data & one quantitative attribute per region
• Tasks: trends, patterns, comparisons
• How: area marks from given geometry, color hue/saturation/luminance
• Scalability: thousands of regions
• Design choices:
  - Colormap
  - Region boundaries (level of summarization)
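The colormap design choice can be made concrete with a small sketch that bins one quantitative attribute per region into a few color classes. The `makeQuantizeScale` helper, the region values, and the four-step light-to-dark blue ramp are all hypothetical; D3 provides `d3.scaleQuantize` for this kind of mapping, but the version below avoids any library so the logic is visible.

```javascript
// Quantize a quantitative attribute into a small number of color classes.
function makeQuantizeScale(min, max, colors) {
  const step = (max - min) / colors.length; // width of each class
  return function (value) {
    // Clamp so that values at or beyond the ends land in the first or last class.
    const index = Math.min(colors.length - 1, Math.max(0, Math.floor((value - min) / step)));
    return colors[index];
  };
}

// Hypothetical per-region values (e.g. percentages) and a light-to-dark ramp.
const ramp = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"];
const regions = [
  { name: "A", value: 5 },
  { name: "B", value: 40 },
  { name: "C", value: 100 },
];
const color = makeQuantizeScale(0, 100, ramp);
const fills = regions.map((r) => ({ name: r.name, fill: color(r.value) }));
```

Changing the number of classes or the class boundaries changes which regions look alike, which is why the colormap is listed as a design choice rather than a detail.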

Choropleth (Two Hues)

[M. Ericson, New York Times]

Problem?

Popular Vote
Obama: 68 million
McCain: 59 million

Amount of red and blue shown on map
Obama: 850,000 mi²
McCain: 2,150,000 mi²

[M. Ericson, New York Times]
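The mismatch on this slide is easy to quantify from the four numbers shown (68 and 59 million votes for Obama and McCain, and 850,000 and 2,150,000 square miles of map area). The snippet below is just that arithmetic; the `shares` helper is a hypothetical name, and the shares are relative to the two-candidate totals on the slide, not to all votes cast.

```javascript
// Compare each candidate's share of the popular vote with the share of map
// area drawn in that candidate's color, using the figures from the slide.
const results = [
  { candidate: "Obama", votes: 68e6, areaSqMi: 850000 },
  { candidate: "McCain", votes: 59e6, areaSqMi: 2150000 },
];

function shares(rows, key) {
  const total = rows.reduce((sum, row) => sum + row[key], 0);
  return rows.map((row) => row[key] / total);
}

const voteShare = shares(results, "votes"); // Obama: 68 / 127, about 0.535
const areaShare = shares(results, "areaSqMi"); // Obama: 850 / 3000, about 0.283
```

The candidate with the majority of the votes covers well under a third of the colored area, so a plain two-hue choropleth visually overstates the other candidate's support.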

Adding Saturation

[M. Ericson, New York Times]

Two Variables

By Population Density

COUNTY WON BY . . . KERRY / BUSH
Urban, Suburban, Rural, Unpopulated*

*Areas with less than three people per square mile.

This map removes mostly uninhabited areas, revealing Mr. Bush’s suburban and rural support in the East and South.

[M. Ericson, New York Times]

Size Encoding

[M. Ericson, New York Times]

Cartograms

[Election Results by Population, M. Newman, 2012]

Cartograms
• Data: geographic geometry data & two quantitative attributes per region (one part-of-whole)
• Derived data: new geometry derived from the part-of-whole attribute
• Tasks: trends, patterns, comparisons, part-of-whole
• How: area marks from derived geometry, color hue/saturation/luminance
• Scalability: thousands of regions
• Design choices:
  - Colormap
  - Geometric deformation
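One simple form of derived geometry is to shrink each region about its own center so that its drawn area becomes proportional to the part-of-whole attribute, as in a non-contiguous cartogram. The sketch below only computes the per-region scale factors under that assumption; `cartogramScales` and the two regions are hypothetical, and contiguous cartograms need a much more involved deformation.

```javascript
// Compute a linear scale factor per region so that the scaled area is
// proportional to the region's value. The densest region keeps its size.
function cartogramScales(regions) {
  const densities = regions.map((r) => r.value / r.area);
  const maxDensity = Math.max(...densities);
  // Area grows with the square of a linear scale factor, so take the square root.
  return regions.map((r, i) => ({
    name: r.name,
    scale: Math.sqrt(densities[i] / maxDensity),
  }));
}

// Hypothetical regions: area in map units, value is the part-of-whole attribute.
const scales = cartogramScales([
  { name: "dense", area: 100, value: 400 },
  { name: "sparse", area: 400, value: 100 },
]);
```

Here the sparse region is drawn at one quarter of its linear size, so its area shrinks from 400 to 25 while the dense region stays at 100, matching the 4:1 ratio of the values.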

Rectangular Cartogram

[New York Times]

Non-Contiguous Cartogram

[M. Bostock, 2012]

World Cartograms

[M. Newman, 2009]

World Population

[M. Newman, 2009]

World Energy Consumption

[M. Newman, 2009]

House Races: More Geographic Data?

[New York Times, 2010]

House Races: Maps Aren't Always Best

[NYTimes]

D3 Map Example
• http://codepen.io/dakoop/pen/MpjGGN
• Load GeoJSON data via d3.json
  - Uses a callback
  - Bind data using datum
• Projections are required to map latitude and longitude to a two-dimensional visualization
  - Albers USA vs. Massachusetts State Plane projection
  - d3.geoPath does work of projecting points for a path
  - Use projection as a function to do projection manually
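To show what "use projection as a function" and "projecting points for a path" amount to, here is a library-free sketch. It is not the code from the linked pen: it swaps in a hand-written spherical Mercator projection instead of the Albers USA or State Plane projections named above, and the `mercator` and `polygonToPath` helpers and the bounding-box feature are hypothetical. In D3 itself, `d3.geoPath().projection(...)` does the path-building step.

```javascript
// A hand-rolled spherical Mercator projection: [longitude, latitude] in
// degrees to [x, y] in pixels, with y growing downward as in SVG.
function mercator(width, height) {
  const scale = width / (2 * Math.PI); // 360 degrees of longitude span the width
  return function ([lon, lat]) {
    const lambda = (lon * Math.PI) / 180;
    const phi = (lat * Math.PI) / 180;
    const x = width / 2 + scale * lambda;
    const y = height / 2 - scale * Math.log(Math.tan(Math.PI / 4 + phi / 2));
    return [x, y];
  };
}

// Turn the rings of a GeoJSON Polygon geometry into an SVG path string by
// projecting every coordinate (the job a geographic path generator does).
function polygonToPath(geometry, projection) {
  return geometry.coordinates
    .map(
      (ring) =>
        ring
          .map((point, i) => {
            const [x, y] = projection(point);
            return (i === 0 ? "M" : "L") + x.toFixed(1) + "," + y.toFixed(1);
          })
          .join("") + "Z"
    )
    .join("");
}

// Hypothetical feature: a box roughly around Massachusetts (approximate bounds).
const feature = {
  type: "Feature",
  geometry: {
    type: "Polygon",
    coordinates: [[[-73.5, 41.2], [-69.9, 41.2], [-69.9, 42.9], [-73.5, 42.9], [-73.5, 41.2]]],
  },
};

const project = mercator(960, 500);
const origin = project([0, 0]); // longitude 0, latitude 0 maps to the center
const d = polygonToPath(feature.geometry, project);
```

The resulting string could be assigned to the `d` attribute of an SVG path element. A whole-world projection like this leaves a single state very small on screen, which is one reason to pick a projection fitted to the region being shown.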
