Data Visualization (CIS/DSC 468)

Maps

Dr. David Koop

D. Koop, CIS 468, Spring 2017

Tree Visualizations

Quantifying the Space-Efficiency of 2D Graphical Representations of Trees

Michael J. McGuffin and Jean-Marc Robert

Abstract: A mathematical evaluation and comparison of the space-efficiency of various 2D graphical representations of tree structures is presented. As part of the evaluation, a novel metric called the mean area exponent is introduced that quantifies the distribution of area across nodes in a tree representation, and that can be applied to a broad range of different representations of trees. Several representations are analyzed and compared by calculating their mean area exponent as well as the area they allocate to nodes and labels. Our analysis inspires a set of design guidelines as well as a few novel tree representations that are also presented.

Index Terms: Tree visualization, graph drawing, efficiency metrics.

1 INTRODUCTION

A variety of graphical representations are available for depicting tree structures (Figure 1), from “classical” node-link diagrams [23, 7], to treemaps [14, 26, 6, 30], concentric circles [2, 27, 31], and many others (see [13] for a survey). A major consideration when designing, evaluating, or comparing such representations is how efficiently they use screen space to show information about the tree. To date, however, it is unclear how to go about evaluating space-efficiency in a way that can be applied to the large variety of tree representations and that enables a fair comparison of them. Space-efficiency might be described in terms of area, aspect ratio, label size, or other measures. However, there is no accepted standard set of metrics for evaluating the space-efficiency of tree representations, and it is unclear what approach would be general enough to be applied to all the forms in Figure 1.

One basic metric of space-efficiency is the total area of a representation. Assuming the representation is bound within a 1 × 1 square, both icicle diagrams and treemaps (Figures 1C and 1G) have a total area of 1, and are equally efficient (and both optimal) according to this metric. Likewise, concentric circles and nested circles (Figures 1E and 1F) both have a total area of π/4 ≈ 0.785 (the area of a circle of diameter 1), and are also equally efficient according to the metric of total area. However, experience suggests that the representations within each of these pairs do not scale equally well with larger, deeper trees. This article shows that there are finer ways of distinguishing efficiency, i.e. that there is more to space-efficiency than total area.

Treemaps are often described as optimally space-efficient, not just because they have a total area of 1, but also because they allow for what we call a weighted partitioning of the area. Nodes can be allocated more or less area, depending on some attribute such as file size, population, or number of species, and furthermore this weighted partitioning can be done without reducing the total area used. These are indeed desirable properties, however they are not unique to treemaps. Figure 2 shows that icicle diagrams also allow for a weighted partitioning of area, and incidentally have no need for margins between the borders of nodes as treemaps often do. Furthermore, although a weighted partitioning is useful for showing the relative sizes of nodes in Figures 2A and 2C, an unfortunate side effect is that labels on small nodes are very difficult to read. If users are more interested in seeing the identity of all nodes rather than their relative sizes, an alternative approach would be to give equal weight to each leaf node (Figures 2B and 2D), improving the overall legibility of nodes. (Although not shown in the figure, the labels could also be augmented to numerically show the “size” attribute of each node.) In terms of label size or legibility, Figures 2B and 2D are clearly preferable, but even they still result in much whitespace around certain labels, suggesting that a more space-efficient (in terms of label size) representation might be possible.

Clearly, it would be useful to have some way to quantitatively distinguish the four possibilities in Figure 2, e.g. in terms of their respective scalability and the sizes of their labels. If total area is the only metric of space-efficiency used, and “optimal” space-efficiency is defined as a total area of 1 (possibly partitioned by weight), then we have no way of distinguishing these four cases. If alternative metrics of space-efficiency are used, such as those investigated in this article, it is not clear initially if treemaps, or any other representation, will still turn out to be optimal with respect to such alternative metrics.

This article identifies several metrics related to space-efficiency, and performs the first rigorous analysis and comparison of the space-efficiency of most of the basic tree representation styles in the information visualization literature, including all those in Figure 1. Some of the key ideas involved are (1) the use of a metric of the size of the smallest nodes (i.e. the leaf nodes) in the representation, in addition to a metric of total area; (2) analyzing the area of labels on the nodes, which implicitly takes into account both the size and aspect ratio of the nodes, measuring how much “useful” area they contain; and (3) analyzing how these metrics behave asymptotically, as the tree grows

Fig. 1. Several basic kinds of tree representations, here each showing a complete 3-ary tree of depth 3 as an example. All representations are drawn to just fit within a 1 × 1 unit square. A: classical (layered) node-link [23, 7]. B: a variation on A, where the shape of nodes better accommodates long labels. C: icicle. D: radial [10, 9]. E: concentric circles [2, 27, 31]. F: nested circles, similar to [5, 28]. G: treemap [14, 26]. H: indented outline, sometimes called a “tree list”, and common in file browsers such as Microsoft Explorer.

• Michael J. McGuffin is with École de technologie supérieure, Montréal, Canada, E-mail: michael.mcguffi[email protected].
• Jean-Marc Robert is with École de technologie supérieure, Montréal, Canada, E-mail: [email protected].

[McGuffin and Robert, 2010]

Treemap

[A. Cox and H. Fairfield, NYTimes, 2012]

Treemap Layout Algorithms
• Slice and Dice
  - Alternate direction as splits occur
• Strip
  - Order rectangles and move to a new row when aspect ratio gets worse
• Squarified
  - Split in both directions to get best aspect ratio
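
The slice-and-dice rule (alternate the split direction at each level) can be sketched in a few lines. This is a minimal illustration, not the implementation used for the examples in these slides: the `Node` class, the function names, and the tiny tree are all hypothetical. The `aspect_ratio` helper shows the quantity that the strip and squarified layouts try to keep close to 1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a size, parents carry children."""
    name: str
    size: float = 0.0
    children: list = field(default_factory=list)

    def total(self) -> float:
        # A parent's weight is the sum of the sizes of its leaves.
        if not self.children:
            return self.size
        return sum(c.total() for c in self.children)


def slice_and_dice(node, x, y, w, h, horizontal=True, out=None):
    """Assign each node a rectangle (x, y, w, h), alternating direction per level."""
    if out is None:
        out = {}
    out[node.name] = (x, y, w, h)
    total = node.total()
    offset = 0.0
    for child in node.children:
        frac = child.total() / total if total else 0.0
        if horizontal:
            # Split the width among the children, then flip direction below.
            slice_and_dice(child, x + offset * w, y, frac * w, h, False, out)
        else:
            # Split the height among the children, then flip direction below.
            slice_and_dice(child, x, y + offset * h, w, frac * h, True, out)
        offset += frac
    return out


def aspect_ratio(rect):
    """Ratio (>= 1) of the longer side to the shorter side of a rectangle."""
    _, _, w, h = rect
    return max(w, h) / min(w, h)


# Illustrative data only: two top-level nodes, one of them with two leaves.
tree = Node("root", children=[
    Node("a", children=[Node("a1", 1), Node("a2", 3)]),
    Node("b", 4),
])
rects = slice_and_dice(tree, 0.0, 0.0, 1.0, 1.0)
```

Each leaf's area is proportional to its size, but nothing in this rule controls aspect ratio, so deep or unbalanced trees tend to produce long thin slivers.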

Treemap Slice and Dice Layout

[Figure: treemap of "Top Albums" (artists and their albums) drawn with the slice-and-dice layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Treemap Strip Layout

[Figure: treemap of "Top Albums" (artists and their albums) drawn with the strip layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Treemap Squarify Layout

[Figure: treemap of "Top Albums" (artists and their albums) drawn with the squarified layout]

[https://philogb.github.io/jit/static/v20/Jit/Examples/Treemap/example1.html]

Geographic Data
• Spatial data
• Cartography: the science of drawing maps
  - Lots of history and well-established procedures
  - May also have non-spatial attributes associated with items
  - Thematic cartography: integrate these non-spatial attributes (e.g. population, life expectancy)
• Goals:
  - Respect cartographic principles
  - Understand data with geographic references using visualization principles

Lookup

Locate

Discrete Categorical Attribute: Shape

[Acadia NP, National Park Service]


Discrete Quantitative Attribute: Color Saturation

Continuous Data

[http://www.nytimes.com/interactive/2011/03/11/world/asia/maps-of-earthquake-and-tsunami-damage-in-japan.html]

Isolines

[USGS via Wikipedia]

Isolines
• Scalar fields:
  - value at each location
  - sampled on grids
• Isolines use derived data from the scalar field
  - Interpret field as representing continuous values
  - Derived data is geometry: new lines that represent the same attribute value
• Scalability: dozens of levels
• Other encodings?
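
Deriving isoline geometry from a gridded scalar field can be sketched as below. This is a simplified marching-squares-style pass, assuming unit grid spacing and linear interpolation along cell edges. The function names are hypothetical, and saddle cells are resolved by an arbitrary pairing rather than a proper disambiguation rule.

```python
def interpolate(p0, p1, v0, v1, level):
    """Point on edge p0-p1 where the linearly interpolated value equals level."""
    t = (level - v0) / (v1 - v0)
    return (p0[0] + t * (p1[0] - p0[0]), p0[1] + t * (p1[1] - p0[1]))


def isoline_segments(grid, level):
    """Return line segments approximating the isoline where the field equals level.

    grid[r][c] is the sample at x = c, y = r (unit spacing is an assumption).
    """
    segments = []
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            # Corners of the cell, listed in order around its boundary.
            corners = [(c, r), (c + 1, r), (c + 1, r + 1), (c, r + 1)]
            values = [grid[y][x] for x, y in corners]
            crossings = []
            for i in range(4):
                j = (i + 1) % 4
                # An edge is crossed when its endpoints lie on opposite
                # sides of the level.
                if (values[i] >= level) != (values[j] >= level):
                    crossings.append(
                        interpolate(corners[i], corners[j],
                                    values[i], values[j], level))
            # Two crossings give one segment. Four crossings (a saddle cell)
            # are paired in edge order, which is only one of two valid choices.
            for k in range(0, len(crossings) - 1, 2):
                segments.append((crossings[k], crossings[k + 1]))
    return segments


# Illustrative field: the value equals the x coordinate, so the 0.5 isoline
# is the vertical line x = 0.5.
field = [[0, 1, 2],
         [0, 1, 2],
         [0, 1, 2]]
lines = isoline_segments(field, 0.5)
```

Calling the function once per level gives one set of lines per contour value, which is where the "dozens of levels" scalability limit comes from: more levels means more overlapping geometry to read.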

Map Projection

[P. Foresman, Wikimedia]

Flattening the Sphere?

[USGS Map Projections]

Lambert Conformal Conic Projection

[USGS Map Projections]
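
For reference, the spherical form of the Lambert conformal conic projection can be written compactly. The sketch below assumes a unit sphere and angles given in degrees. The function name and the sample parameters (standard parallels at 33° and 45°, origin at 39°N, 96°W) are illustrative choices, not values taken from the slides.

```python
import math


def lambert_conformal_conic(lat, lon, lat1, lat2, lat0, lon0):
    """Project (lat, lon) in degrees to planar (x, y) on a unit sphere.

    lat1, lat2: standard parallels; (lat0, lon0): projection origin.
    """
    phi, lam = math.radians(lat), math.radians(lon)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)

    def t(p):
        return math.tan(math.pi / 4 + p / 2)

    # Cone constant n; with a single standard parallel it reduces to sin(phi1).
    if math.isclose(phi1, phi2):
        n = math.sin(phi1)
    else:
        n = (math.log(math.cos(phi1) / math.cos(phi2))
             / math.log(t(phi2) / t(phi1)))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = f / t(phi) ** n      # radius of the parallel through the point
    rho0 = f / t(phi0) ** n    # radius of the parallel through the origin
    theta = n * (lam - lam0)   # angle on the unrolled cone
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


# Illustrative parameters for a mid-latitude region of the northern hemisphere.
origin = lambert_conformal_conic(39, -96, 33, 45, 39, -96)
east = lambert_conformal_conic(39, -90, 33, 45, 39, -96)
north = lambert_conformal_conic(45, -96, 33, 45, 39, -96)
```

The projection is conformal (it preserves local angles), and scale is true along the two standard parallels, so distortion grows with distance from them.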

Map Projections

[http://xkcd.com/977/]

Choropleth (Two Hues)

[M. Ericson, New York Times]

Area Marks and Color Hue & Saturation

Choropleth Map
• Data: geographic geometry data & one quantitative attribute per region
• Tasks: trends, patterns, comparisons
• How: area marks from given geometry, color hue/saturation/luminance
• Scalability: thousands of regions
• Design choices:
  - Colormap
  - Region boundaries (level of summarization)
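
The colormap design choice can be made concrete with a small sketch that assigns each region a fill color from its quantitative attribute. It assumes equal-interval classes and a single-hue ramp from white to dark blue. The function names, the endpoint colors, and the sample values are hypothetical.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate two RGB triples; t is expected in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def to_hex(rgb):
    """Format an (r, g, b) triple of 0-255 integers as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def choropleth_colors(values, low=(255, 255, 255), high=(8, 48, 107), classes=5):
    """Map each region's value to a fill color using equal-interval classes.

    classes must be at least 2.
    """
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    colors = {}
    for region, v in values.items():
        # Class index 0 .. classes-1; the maximum value falls in the top class.
        k = min(int((v - lo) / span * classes), classes - 1)
        colors[region] = to_hex(lerp_color(low, high, k / (classes - 1)))
    return colors


# Illustrative data only: one quantitative attribute per region.
fills = choropleth_colors({"A": 0, "B": 50, "C": 100})
```

Binning into a handful of classes is a deliberate choice here: readers can distinguish only a few luminance steps reliably, and a different classing scheme (for example quantiles) would change which regions share a color.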

Choropleth (Two Hues)

[M. Ericson, New York Times]

Problem?

Popular Vote
Obama: 68 million
McCain: 59 million

Amount of red and blue shown on map
Obama: 850,000 mi²
McCain: 2,150,000 mi²

[M. Ericson, New York Times]
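
The mismatch can be quantified directly from the figures on this slide. The short sketch below computes each candidate's share of the two-candidate vote total and of the colored map area. Only the four numbers come from the slide; the variable and function names are illustrative.

```python
votes = {"Obama": 68e6, "McCain": 59e6}          # popular vote, from the slide
area = {"Obama": 850_000, "McCain": 2_150_000}   # square miles shown on the map


def shares(counts):
    """Convert raw counts into fractions of their total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


vote_share = shares(votes)
area_share = shares(area)
for name in votes:
    print(f"{name}: {vote_share[name]:.1%} of votes, "
          f"{area_share[name]:.1%} of map area")
# Obama: 53.5% of votes, 28.3% of map area
# McCain: 46.5% of votes, 71.7% of map area
```

The winner of the popular vote occupies well under a third of the colored area, because a choropleth's area marks encode land area rather than the attribute of interest.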

Adding Saturation

[M. Ericson, New York Times]

Two Variables

By Population Density

Legend: county won by Kerry or Bush; Urban, Suburban, Rural, Unpopulated*

*Areas with less than three people per square mile.

This map removes mostly uninhabited areas, revealing Mr. Bush’s suburban and rural support in the East and South.

[M. Ericson, New York Times]

Size Encoding

[M. Ericson, New York Times]

Cartograms

[Election Results by Population, M. Newman, 2012]

Cartograms
• Data: geographic geometry data & two quantitative attributes per region (one part-of-whole)
• Derived data: new geometry derived from the part-of-whole attribute
• Tasks: trends, patterns, comparisons, part-of-whole
• How: area marks from derived geometry, color hue/saturation/luminance
• Scalability: thousands of regions
• Design choices:
  - Colormap
  - Geometric deformation
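
The "derived geometry" step is simplest in the non-contiguous case, where each region is scaled about its own centroid until its area is proportional to the part-of-whole attribute. The sketch below assumes simple polygons in planar coordinates. The function names and the `value_per_unit_area` parameter are hypothetical, and contiguous cartograms need a far more involved deformation.

```python
import math


def polygon_area_centroid(points):
    """Shoelace area and centroid of a simple polygon given as (x, y) vertices."""
    a = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a /= 2.0  # signed area; the sign depends on vertex order
    return abs(a), (cx / (6.0 * a), cy / (6.0 * a))


def scale_region(points, value, value_per_unit_area):
    """Scale a region about its centroid so its area is value / value_per_unit_area."""
    area, (cx, cy) = polygon_area_centroid(points)
    target = value / value_per_unit_area
    k = math.sqrt(target / area)  # lengths scale with the square root of area
    return [(cx + k * (x - cx), cy + k * (y - cy)) for x, y in points]


# Illustrative region: a unit square whose attribute calls for four units of area.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
scaled = scale_region(square, value=4.0, value_per_unit_area=1.0)
```

One common choice is to pick `value_per_unit_area` from the densest region so that every scale factor is at most 1. Regions then only shrink and cannot overlap their neighbors.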

Rectangular Cartogram

[New York Times]

Non-Contiguous Cartogram

[M. Bostock, 2012]

World Cartograms

[M. Newman, 2009]

World Population

[M. Newman, 2009]

World Energy Consumption

[M. Newman, 2009]

House Races: More Geographic Data?

[New York Times, 2010]

House Races: Maps Aren't Always Best

[NYTimes]
