Arc Length-Based Aspect Ratio Selection

Justin Talbot, John Gerth, and Pat Hanrahan equal y=1/x unequal

Arc length AWO MS GOR LOR Arc length AWO MS GOR LOR (a) (b)

Fig. 1. Comparison of aspect ratio selection algorithms. On the left, only AWO and our new arc length-based method are (produce the same aspect ratio) under changes to the parameterization of a curve. On the right, only our arc-length method preserves the natural symmetry of the curve y = 1/x.

Abstract—The aspect ratio of a plot has a dramatic impact on our ability to perceive trends and patterns in the data. Previous approaches for automatically selecting the aspect ratio have been based on adjusting the orientations or of the line segments in the plot. In contrast, we recommend a simple, effective method for selecting the aspect ratio: minimize the arc length of the data curve while keeping the of the plot constant. The approach is parameterization invariant, robust to a wide range of inputs, preserves visual symmetries in the data, and is a compromise between previously proposed techniques. Further, we demonstrate that it can be effectively used to select the aspect ratio of contour plots. We believe arc length should become the default aspect ratio selection method. Index Terms—Aspect ratio selection, Banking to 45 degrees, Orientation resolution.

1 INTRODUCTION Our ability to perceive trends and patterns in a given visual display timization criterion for choosing the aspect ratio: minimize the arc of data is heavily influenced by the aspect ratio. It affects densities, length of the plotted curve while keeping the area of the plot con- relative distances, and orientations within the plot—all important per- stant. This approach is parameterization invariant, preserves symme- ceptual features that impact plot interpretation. tries, and produces reasonable aspect ratios on a wide range of inputs, For data visualization tools, an important practical question is how while being fast to evaluate. Furthermore, it can easily be used with to automatically select the aspect ratio for a given plot. Cleveland et general curves in 2D, which we demonstrate by selecting good aspect al. proposed two methods, median (MS) [4] and length weighted ratios for contour plots. We believe that the properties of the arc length average orientation (AWO) [3], both based on centering the orien- approach are a qualitative improvement over the previous approaches tations of a line plot’s constituent segments around 45 degrees, an and that it should become the default aspect ratio selection method. approach that they call banking to 45°. More recently, Heer and The remainder of the paper is divided into four sections. In Sec- Agrawala [7] proposed two alternative methods, global orientation tion 2, we review the previous work in aspect ratio selection. Next, resolution (GOR) and local orientation resolution (LOR), based on we describe the new arc length method and discuss its connections to maximizing the orthogonality of pairs of line segments. previous methods. In Section 4, we validate the arc length approach All four proposed methods have shortcomings. Three are not pa- by demonstrating that it selects better aspect ratios on both a range of rameterization invariant, meaning that the semantically unimportant simple algebraic curves and on time series data sets; by showing that it way a curve is approximated by line segments can dramatically change can generalize by selecting aspect ratios for 2D contour plots; and by the selected aspect ratio (Figure 1a). None preserve semantically im- showing that it is computationally feasible. Finally, we conclude with portant symmetric shapes (Figure 1b). Most can produce extreme as- discussion of the algorithm and the need for a concrete theory tying pect ratios on relatively simple inputs (Figure 3). And all have been aspect ratio selection to perceptual principles. designed to work with curves that are functions of x, limiting their broader use. 2 PREVIOUS WORK To address these problems, we recommend a simple and robust op- The aspect ratio question was first treated rigorously by Cleveland, McGill, and McGill [4]. Noting that changing the aspect ratio of a plot changes the perceived of lines in a plot, they hypothesized that • Justin Talbot is with Stanford University, E-mail: [email protected]. aspect ratios which maximized the orientation resolution, the • John Gerth is with Stanford University, E-mail: [email protected]. between the line segments, would minimize error in the estimation of • Pat Hanrahan is with Stanford University, E-mail: the ratio of the slopes of those line segments. A designed experiment [email protected]. displaying pairs of line segments supported this hypothesis. Manuscript received 31 March 2011; accepted 1 August 2011; posted online They went on to show that in the case of two line segments in the 23 October 2011; mailed on 14 October 2011. first quadrant, the angle between them can be maximized (and thus For information on obtaining reprints of this article, please send slope ratio error minimized) by selecting the aspect ratio that centers email to: [email protected]. the two line segments around 45°. This led them to suggest that the aspect ratio of plots with more than two segments could be chosen by placing the median segment slope at 1, an approach they called median 3.1 Criteria absolute slope (MS). In developing a new aspect ratio selection algorithm, we were guided Cleveland later suggested an alternative method, length-weighted by the following design criteria suggested by our experience with the average orientation (AWO), that sets the length-weighted average of methods in the previous work. the absolute segment orientations (the angle made with the horizon- tal) at 45° [3]. The switch from slope to orientation was motivated by • Scale invariant. Changes to the input scale of one or both axes the fact that our perceptual processes are sensitive to orientation, not should not change the resulting aspect ratio. slope. The length-weighting ensures that subdividing line segments will not change the resulting aspect ratio, providing some of • Parameterization invariant. The displayed curve is naturally robustness to changes in the input representation. Across a wide range approximated by a set of line segments. Changes to this approx- of inputs, AWO results in flatter aspect ratios than MS; Cleveland as- imation that do not drastically alter the visual form of the curve serts that these are more visually pleasing aspect ratios [2]. should not impact the aspect ratio. In particular, the method Later, Heer and Agrawala [7] proposed selecting the aspect ratio should not assume that the values are equally spaced on the x by maximizing the sum of squares of the angles between all pairs of axis. segments in the plot. They called this the global orientation resolu- tion (GOR) method and, unlike the previous methods, it has the nice • Robust. Corner cases, such as horizontal, vertical, or collinear property of being derived from Cleveland’s hypothesized perceptual segments, should not require special handling. The addition of a justification. In practice, this approach gives very similar results to an small amount of noise to the curve should not change the aspect unweighted average orientation approach at the cost of a very expen- ratio. sive optimization function. A computationally cheaper approach is to only consider the orientation resolution between adjacent pairs of seg- • Symmetric. The most reasonable aspect ratio for a curve sym- ments. Heer and Agrawala called this approach the local orientation metric around y = x is one that preserves that symmetry (e.g. resolution (LOR). Both approaches have trouble dealing with perfectly Figure 1b). (This is the weakest of our criteria. However, in the horizontal and vertical segments, so they must be removed first. “For absence of any perceptual results to the contrary, a symmetric completeness,” Heer and Agrawala also suggest a third method, set- banking is the most parsimonious result.) ting the average slope (AS) to be 1, an approach closely related to our • Fast to compute. To be useful in interactive visualization tools, arc length method. the method must be relatively fast. We would like the computa- Guha and Cleveland [5] recently suggested the resultant vector tion cost to be at most O(n) in the number of line segments in method which has a simple, tractable algebraic form that they employ the plot. to study theoretical properties of aspect ratio selection. Geometrically, the plot’s line segments are treated as vectors and positioned in the 3.2 Algorithm first quadrant by reflection around the x and y axes. The sum of these is the plot’s “resultant vector”. The transformation which banks it to The input to our algorithm is the set of line segments for which we 45 degrees determines the plot’s aspect ratio. Mathematically, this want to compute an aspect ratio for display. The arc length method simplifies to computing the ratio of the segments’ total variation in the consists of simply minimizing the total length of the line segments 1 y and x directions. With this formulation, Guha and Cleveland have under the constraint that the area of the plot is preserved. been able to derive geometric properties for the aspect ratios of select We enforce the area-preserving constraint by not looking for an as- families of curves and demonstrate the statistical convergence of the pect ratio directly. Instead, we search for a transformation that adjusts aspect ratio as random noise is added to the data. the plot’s aspect ratio while preserving area. Such transformations are Our arc length method arises from an independent analysis and geo- called hyperbolic rotations or squeeze mappings and can be written in metric interpretation of the aspect ratio selection problem. But we can form as: 1 a 0 show that the arc length method can be interpreted as a generalization /√ 0 a of the resultant vector method. Because of this, many of the arc length  √  method’s empirical advantages demonstrated in this paper are shared (This definition follows Cleveland, who defined aspect ratio as by the resultant vector method as well. height/width. Heer and Agrawala use the, perhaps more common, Visualization systems have long supported interactive aspect ratio width/height definition.) We can confirm that these transformations changes (at least since Buja’s Data Viewer [1]), a powerful and un- preserve area since the determinant of the matrix is always derutilized analytic tool. Automatic aspect ratio selection methods are complementary to such interaction, providing reasonable defaults, en- √a 0 = 1 abling automated plot generation, and helping to develop insight into √a − the relationship between aspect ratio and human perception. Finally, Heer and Agrawala [7], in the second part of their paper, Given a, we can multiply the coordinates of the input line segments by suggest using frequency space analysis to decompose a time series the corresponding squeeze mapping to produce line segments appro- into multiple components each of which can be banked independently. priate for display. Since the decomposition problem is independent of the aspect ratio We select a such that the total length of the line segments is mini- selection problem, in this paper we assume that, if necessary, the in- mized. To do this we solve the optimization problem, put has already been processed using Heer and Agrawala’s method or N another statistical decomposition technique. ∆xi min ∑ ,√a∆yi (1) a (0,∞) || √a || ∈ i=1 3 ARC LENGTH-BASED ASPECT RATIO SELECTION where ∆x and ∆y are the lengths of the ith line segments in the x and y In this section we propose a new aspect ratio selection method: mini- i i directions and x is the Euclidean length. Note that the lengths of the mize the arc length of the plotted curve while keeping the area of the line segments will|| || be scaled by the area-preserving aspect ratio trans- plot constant. We first briefly describe desirable properties we would formation. Due to the square root in the Euclidean length measure, this like the algorithm to have. We then describe the arc length method in optimization problem does not have a closed form solution; however, more detail. Finally, we compare it to previous algorithms and show how it can be derived from the optimization of orientation resolution 1Constraining the area simplifies the algorithm, but isn’t strictly necessary. by following reasoning similar to Heer and Agrawala’s GOR and LOR Arc length scales linearly with the square root of the plot area, so we could also methods. minimize the arc length divided by the square root of the area. where vi is the ith line segment represented as a vector. Next, within the family of area-preserving aspect ratio changes, the cross product is not a function of a and can be ignored in the optimization:

√a √a vi(a) v j(a)= xiy j x jyi = xiy j x jyi × √a − √a − Finally, the denominator can be easily turned into the square of the total arc length.

2 ∑∑li(a)l j(a)= ∑li(a)∑l j(a)=(∑li(a)) i j i j i Fig. 2. For a straight line, the minimum length is achieved at 45 degrees. For an , the minimum arc-length is achieved when banked to a Thus, orientation resolution (as expressed by Equation (2)) can be circle. maximized by minimizing the total arc length. 3.4 Connection to Resultant Vector and Average Slope it has a unique minimum and can be easily and quickly solved using We have presented the arc length method using the Euclidean metric to a variety of methods. In practice, we parameterize the optimization measure length. A potentially useful property of the Euclidean metric search with log(a), rather than with a directly. This makes the cost is that it is rotationally invariant. However, there may be reasons to function smooth and symmetrical around the minimum and removes prefer other metrics. For example, if we replace the Euclidean metric the need to deal with the lower bound on a. with the Manhattan metric, we can get a closed form solution for a: We can gain some insight into the behavior of our proposed method by considering simple cases (Figure 2). If we stretch or squish a rect- N ∆x ∑N ∆y i ∆ i=1 i angle bounding a diagonal line while keeping the contained area the min ∑ + √a yi a = N | | (3) a (0,∞) | √a | | | ⇒ ∑ ∆xi same, the line will reach a minimal length when the rectangle is square. ∈ i=1 i=1 | | Thus, a single diagonal line will always be banked to 45 degrees. Sim- This is exactly equivalent to the resultant vector method proposed by ilarly, consider a collection of line segments approximating an ellipse. Guha and Cleveland [5]. Thus, another geometric interpretation of the Our method will bank this to a circle since the ellipse with minimal resultant vector method is that it minimizes the Manhattan length of circumference enclosing a fixed area is a circle. the curve. Further, in the special case of a time series with equally 3.3 Connection to Orientation Resolution spaced data points, this Manhattan metric approach is also equivalent to Heer and Agrawala’s AS method. The arc length metric is closely connected to Cleveland et al.’s orienta- tion resolution perception hypothesis which assumes that, to make the 4 RESULTS ratios of segment slopes easily visible, the angle between the segments To demonstrate the effectiveness of the arc length method, we first should be made as large as possible. compare the methods on a set of simple algebraic curves and then on Heer and Agrawala’s GOR method [7] attempts to directly maxi- a variety of time series data sets. The former serves to emphasize the mize the orientation resolution by finding the aspect ratio, a, that max- robustness and symmetry of the arc length method. The latter shows imizes the sum of the squares of the larger angles between all pairs of that arc length is a consistent compromise method, producing aspect line segments: ratios that fall within the range of previous methods. Next, we show 2 max θi j(a) a ∑∑ that the arc length method is also effective at selecting aspect ratios for i j general 2D curves. Finally, we discuss the performance of the method. Following similar reasoning, we propose an alternate criterion for 4.1 Method Comparison maximizing orientation resolution: Algebraic Curves. In Figure 3, we compare arc length to the four ∑ ∑ sin(θi j(a)) li(a)l j(a) methods recommended in the previous work on a variety of simple max i j | | (2) a ∑i ∑ j li(a)l j(a) algebraic curves. The curves were sampled evenly along the x-axis to make the comparison to the non-length-weighted methods as fair where θi j(a) is the angle made between the two line segments i and as possible. For the GOR and LOR methods, we discarded horizontal j, and li and l j are the lengths of the ith and jth line segments, all of and vertical segments as recommended by Heer and Agrawala. which vary as functions of a. AWO is the most similar to arc length, though arc length’s aspect In comparison to Heer and Agrawala’s approach, our formulation ratios are slightly taller with higher orientation resolution. This is has two primary differences. First, our formulation includes weighting promising since Cleveland found AWO to generally produce superior by the lengths of the line segments. This makes the measure invariant aspect ratios. The quarter circle example, in the fifth row, demon- to parameterization changes, addressing a shortcoming of GOR. Sec- strates that arc length can preserve symmetry, whereas AWO produces 2 ond, we use sin(θi j) instead of θ . Both functions reach a maximum a segment of an ellipse. As the scale parameter of the gamma (Γ) | | i j when lines are at right angles to each other; but our choice is the same distribution increases, arc length maintains roughly the same aspect for both the larger and smaller angles made between line segments i ratio. In contrast, AWO gets flatter, biased by the large nearly flat re- and j; so, unlike GOR, we do not need to specify which is selected. gion. The last two rows show where the results of arc length and AWO The behavior of this criterion is easy to describe. It will be 0 if all diverge most noticeably. AWO, GOR, and LOR all scale down the nar- the lines are parallel and reaches a maximum at 1 if all the lines are at row triangle case to maintain 45° lines. Heer and Agrawala argue that right angles to each other. The latter case is, in practice, impossible, this behavior is preferred, since it maintains ideal slopes. However, in since we include the terms for each line segment paired with itself and the limit, as the triangle’s width approaches 0, this behavior will se- this pairing obviously cannot be at right angles. However, the intuition lect a completely flat aspect ratio, obscuring the presence of the spike holds that the closer this number is to 1, the more line segments are entirely. Arc length’s behavior is different; it selects nearly the same orthogonal. aspect ratio regardless of the triangle’s width. While this deviates from Now we show that Equation (2) can be simplified to the arc length Cleveland’s 45° recommendation, it ensures that the spike will remain criteria. First, the numerator can be restated as a cross product since visible regardless of how narrow it is. MS and GOR are susceptible to producing abnormally tall aspect sin(θi j(a)) li(a)l j(a)= vi(a) v j(a) ratios when there are nearly flat regions of the curve. In the gamma | | | × | e flnsfligwti h uln hehl,mkn h resulting the making threshold, the in culling invariant. changes scale the to not within thus, algorithm and, falling slope lines input of the in set changes to lead will axis vr tsntovosi hstrsodcnb aesaeinvariant. scale made be can threshold this seg- na How- if a line horizontal. With obvious also and not vertical but of it’s threshold segments, ever, some vertical within are and that horizontal ments just not culling selection. ratio cannot aspect We robust idiosyncratic. for quite use is its LOR recommend of behavior the Finally, ments. a omda lp,wieLR(imn)i oevariable. more is (diamond) simi- LOR quite while often slope, is median (square) than to GOR lar ratios (triangle). method aspect slope time flatter median length-weighted natural produces the the and consistently that find artificial (circle) we of method Agrawala, AWO set and large Heer Like a [11]. for series heuristics banking the by stesaeprmtricess diinly oetedegenerate the note the Additionally, on GOR ratio increases. of aspect result parameter 0-width a shape to the degenerate as methods both cases, distribution dealing when degeneracies po performs have LOR board. but segments. the sel vertical circle cases, or length quarter horizontal many arc nearly the in with but for well similar, ratio work are aspect GOR AWO symmetric and pleasing length more Arc sa evenly x-axis. curves algebraic the simple of recommended selection by a produced for ratios methods aspect of Comparison 3. Fig.

ieSeries. Time narrow wide quarter tmyb osbet mrv h eairo O rLRby LOR or GOR of behavior the improve to possible be may It triangle triangle Γ(2, 128) Γ(2, 16) Γ(2, 8) gaussian sin(x) circle x4 x Arc length ¨ v hehligapoc,cagst h nu cl fan of scale input the to changes approach, thresholding ıve iue4sostelgo h setrtochosen ratio aspect the of log the shows 4 Figure y AWO = x ltdet h oiert falteseg- the all of colinearity the to due plot MS GOR pe along mpled ryacross orly Sand MS . LOR banking csa ects uhtle setrto.I oecss sin produce as somewhat cases, generally produces some methods In length other ratios. arc aspect The taller curves, AWO. much than algebraic ratios the aspect with taller As 4. scale, ure log a On ratio. aspect geometric range. their compromise the to natural corresponds a of ratios mean, center aspect two the length between near midpoint arc the the falls Further, often data. that ratio series bankings aspect time produce encourag- in to is trends known This highlight are effectively methods. techniques previous previous the the since of ing range the within falls that (cir AWO between methods. ratios other (diamond) aspect the sets: LOR compromise and data selects method (triangle), series length MS arc time (square), select GOR for met (cross), previous length length by arc produced ratios to aspect ative of Comparison 4. Fig. sovosytotl.I tes uhas such others, In tall. too obviously is npeiu ok akn a enol ple oposweethe where plots to applied of only function been a is has curve banking work, previous 2D In in Curves data, the 4.2 in trend interest. frequency of lower cycles a frequency higher highlight the to obscuring is ratio aspect taller nemasta r eghde o eur vnsmln ftecurve the of ad- sampling even two invari- require has parametric not method’s does approach length length arc length arc that the arc means First, ance The task. 2D. this in in vantages curves general to plied dowjones computer bankdata shampoo ukdeaths adv_sale wagesuk condmilk elecnew iue5sosaslcino h iesre umrzdi Fig- in summarized series time the of selection a shows 5 Figure ratio aspect an picks consistently (cross) heuristic length arc The housing hsales2 cpi_mel plastics ustreas bricksq wnoise deaths boston motion pollutn capital hsales writing schizo 9−17b labour sheep airline bicoal huron beer2 prodc motel fancy 9−11 9−10 9−13 9−12 ibm2 jcars mink dole pigs elec lynx cow milk 9−4 9−3 9−9 dj x oee,bnighuitc a lob ap- be also can heuristics banking However, . log(relative aspectratio) dole and 9-13 fancy and h euto the of result the , jcars W and AWO h result the osrel- hods l) arc cle), The . n,snei ae uvssmercaround symmetric curves Sec makes curves. it 2D since for ond, parameterization inconvenient an x-axis, the along o patterns. details frequency the high obscure and but low data, the the between in trend frequency low the uv otedt.Uigteaclnt ehdo h OS curve LOESS the on method length arc the Using data. the to curve ihteagbaccre,aclnt ssmlrt W,btp ( but cases AWO, some to in similar the ratios is of aspect length some arc tall show curves, which 4 algebraic Figure the from with series time Select 5. Fig. r hw eldt e fteeuainlvladfriiyrt of rate In fertility that [12]. assumed and 1888 have level we about education plot, in top the Switzerland the of of set provinces data French-speaking real the a shows 7 ure consistently sets. produces data these which For on LOR ratios shown ratios. aspect not poor aspect have flat we extremely similar reasons, gaus- producing results space bivariate sometimes producing the and sometimes pro- as AWO, mixed, but such to is similar, cases behavior GOR’s simple very very is sian. AWO for the even are ratios. think duces aspect we what pleasing MS. producing as visually circles same most to the circles essentially bank behaves methods length Both arc plots, contour these in length, equal as nearly methods of non-length-weighted possible. segments the as to line fair comparison in the results contour making extract again process we then This and density the lines. of model nonparametric a fit to R’s use We up make method. that the segments aspect to line an input the select as using to contours simply method the by length plots arc contour the is for use common ratio can most We the plot. perhaps contour 2D, the in curves general featuring alization ilb akdt circles. to banked be will otu Plots. Contour osehwbnigcnorposdfesfo akn uvs Fig- curves, banking from differs plots contour banking how see To AWO, to similar was length arc where case, curve simple the Unlike sets. data real of number a on approach this demonstrates 6 Figure prodc jcars fancy dowjones dole capital 9−13 kde2d Arc length 1]gi-ae enldniyetmto routine estimation density kernel grid-based [16] hl ubro lttpsaiei aavisu- data in arise types plot of number a While 9-13 y safnto of function a is and jcars y AWO .I the In ). = x fpsil,circles possible, if x n taLOESS a fit and dole h ihrfeunypten.Aclnt n W oabette a do AWO and length Arc patterns. frequency higher the f oue eeal alrapc ais h te ehd c methods other The ratios. aspect taller generally roduces and ags ifrne nslce setrto costere the across ratios aspect selected in differences largest fancy - iesre,M,GR n O eettl setrto htemph that ratios aspect tall select LOR and GOR, MS, series, time ait omldsrbto ftemi lse fpit n visually and points of bi- cluster roughly outliers. main the the highlights den- the emphasizes that the of of ratio distribution lines aspect normal contour an variate the select joint on we their method estimate, seeing length sity in arc interested the are Using we density. thus, and independent, are ables qez apn transformation. mapping squeeze where o oprsn ntebto lt easm htthe that assume outliers. we the plot, of bottom impact the the in reduces be- comparison, and relationship For negative education, the and emphasizes fertility that tween ratio aspect an in results emn fe oainby after segment piiaint nld oain wihaeas rapeevn)by preserving) area also criterion: the optimization are in the (which consider modifying we rotations transformations include of to set optimization the expand can we example, dimensionality of form axes. arbitrary some axes created when the has when as reduction happen meaningful, may inherently This su not useful. transformations be are generally other might 2D that skews in or transformation curves rotations as only general for the But are sense. makes changes ratio aspect then a MS h setrto and, ratio, aspect the , te transformations. Other gi,ti a aiyb oeuigteaclnt esr.For measure. length arc the using done be easily can this Again, R ( x i , θ ) and min a , R θ ( y i ∑ = N i , 1 θ θ || θ h oainage nigtebs rotated best the finding angle, rotation the , ) GOR R hs h piiainwl eoe both over be will optimization the Thus, . r h n opnnso the of components y and x the are ( ∆ ftepotdcrei ucinof function a is curve plotted the If √ x a i , θ ) , √ aR ( ∆ omne ehd.As methods. commended niu opoueoverly produce to ontinue y i , θ o fcompromising of job r ) || LOR x and i y hline th asize vari- ch x , iuefse.Ee o h eaierslssol erlal.Bt the Both reliable. the be and should methods mag- results form relative of closed the orders so, be Even likely faster. would Im- nitude language performance. compiled about a conclusions in broad plementations overly draw to hesitate we require not does faster. it solutions times form since 10-20 closed further LOR The a and function. length were AWO transcendental Arc a as of fast evaluation second. the as 1 twice within about segments was 50,000-100,000 bank to able O nodrt nueta twl cl otedsrddt size. data desired the to scale use will to it choosing that when ensure taken to be order should in care GOR more demand- bit A in even conditions. ing applications interactive for feasible computationally mlmnain rte nRadRsbuilt-in R’s and R in written implementations ihnteramo htsmoewudwn opo epcal for well (especially plot second), the to hand, 1 more other want than the would with On someone more series data). what took contour time of so bank realm (doing the to segments within GOR 250 use about to than impractical it found We ssa piiaincieinta is that criterion optimization an uses lsdfr ouin,wihcnb optdin have computed vector) be (resultant can length which arc solutions, Manhattan form and closed AS, charac MS, performance simple relatively teristics. have all methods proposed The Performance 4.3 aspec flat extremely cases, pr some ho its in near but with cause, problems s segments similar, th GOR’s vertical the is ellipses. of AWO roughly in three results are angles circles. and MS flatter to length and circles arc length banking Arc using nicely plots methods. contour proposed Banking viously 6. Fig. eghcnb fcetyotmzduigacieinta is that criterion a using optimized efficiently be can length n stenme fln emnsi h nu.AO O,adarc and LOR, AWO, input. the in segments line of number the is ie h o efrac fa nepee agaesc sR, as such language interpreted an of performance low the Given etse h efrac ftevrosbnigagrtm using algorithms banking various the of performance the tested We trees rock iris old faithful catholic swiss height gaussian Arc length AWO O ( n ) piiainmtosapa obe to appear methods optimization O O ( ( n n 2 ) ) piiainmtoswere methods optimization . MS optimize O ( n ) ie where time, frnefor eference O GOR ( iotlor rizontal ratios. t function. n ) GOR . pre- e ame, - edi o s npeeec otepeiu work. previous recom- the method to length preference in arc use the for of it properties mend empirical these think We ehv rsne naclnt-ae praht setrtoselec properties: ratio following aspect the to has approach which length-based tion arc an presented have We ai eeto ehdrmisa pnpolm nhswr,Cleve- ratio work, aspect for his foundations In theoretical possible problem. two suggested open has an land as remains an appro- of method an “correctness” selection Developing the ratio address can others. that over theory perceptual method priate selection ratio aspect our h ulesfo h ancutro points. of cluster main ra the aspect from Fertil resulting outliers assuming The the estimation variables. density ban independent 2D is are assuming a plot ucation of bottom curve lines The LOESS contour ne variables. the a the two emphasizes using the dependen which between banked Education different lationship is on two dependent with plot is set top Fertility data The same the sumptions. Banking 7. Fig. D 5 oee,w aentpeetdany presented not have we However, ti atrt opt hnAOadsbtnilymr scalable more substantially and AWO than compute to faster is It • ratios aspect compromise produces it curves, series-like time For vertical, • for handling special need not does invariant, scale is It • orienta- maximizing equation an from directly derived be can It • Fertility hnGOR. than GOR banking and MS, ellipses fail. to produces can AWO similarly LOR contrast, and performs In it circles. plots, to circles where contour behave to cases For continue AWO In and length well. arc methods. fail, previous MS and of LOR, GOR, mean geometric the curve. near the of parameterization LOR, the GOR, to invariant to is contrast it in MS, and and segments, colinear or horizontal, orientation. or of slope function to formulation—a reference without parsimonious alone, a by length is proposed ratio it aspect and for Cleveland, criteria perceptual a resolution, tion SUSO AND ISCUSSION 40 90 0

Fertility 40 90 50 0 F UTURE Education Education W ORK perceptual esn oprefer to reasons i highlights tio t n Ed- and ity e using ked aiere- gative yas- cy 50 pect - theory, neither of which appears to be completely satisfactory. REFERENCES The first, orientation resolution [4], was used in this paper (Sec- [1] A. Buja, C. Hurley, and J. A. Mcdonald. Elements of a viewing pipeline tion 3.3) to derive the arc length metric. It was also used by Heer and for data analysis. 1988. Agrawala to justify their GOR and LOR methods. Cleveland’s original [2] W. S. Cleveland. The Elements of Graphing Data. Hobart Press, Summit, banking study [4] showed that orientation resolution was important in New Jersey, U.S.A., 1985, 1994. slope comparisons between pairs of line segments. More recent per- [3] W. S. Cleveland. A model for studying display methods of statistical ceptual results cited by Loffler [10] support the orientation resolution graphics. Journal of Computational and Graphical Statistics, 2(4):323– hypothesis. For example, Heeley and Buchanan-Smith [6] showed that 343, 1993. angle judgments depend principally on the angle size (best near 0, 90, [4] W. S. Cleveland, M. E. McGill, and R. McGill. The shape parameter of 180, and 270 degrees) and are largely rotationally invariant. However, a two-variable graph. Journal of the American Statistical Association, experiments by Snippe and Koenderink [15] appear to show that angle 83(402):289–300, 1988. discrimination may depend on orientation. Further, Kennedy, Orbach [5] S. Guha and W. S. Cleveland. Perceptual, mathematical, and statistical properties of judging functional dependence on visual displays. Technical and Loffler showed that both angle discrimination [8] and angle es- report, Purdue University Department of Statistics, 2011. timation accuracy [9] are significantly better for angles embedded in [6] D. Heeley and H. Buchanan-Smith. Mechanisms specialized for the per- an isosceles rather than scalene triangle. These latter results suggest ception of image geometry. Vision Research, 36(22):3607–3627, 1996. that orientation resolution may be insufficient. Perception of angles is [7] J. Heer and M. Agrawala. Multi-scale banking to 45 degrees. IEEE Trans- influenced by their embedding within a geometric figure, which is not actions on Visualization and Computer Graphics, 12:701–708, 2006. accounted for by the orientation resolution hypothesis. [8] G. J. Kennedy, H. S. Orbach, and G. Loffler. Effects of global shape on The second theoretical foundation proposed by Cleveland is cur- angle discrimination. Vision Research, 46(8-9):1530–1539, 2006. vature [3], or the change in orientation of the curve per unit of arc [9] G. J. Kennedy, H. S. Orbach, and G. Loffler. Global shape versus local length. In the perception literature, Whitaker and McGraw [17] have feature: An angle illusion. Vision Research, 48(11):1281–1289, 2008. tied curvature aspect ratio to better visual discrimination across multi- [10] G. Loffler. Perception of contours and shapes: Low and intermediate ple scales. Curvature is also promising due to its wide use in computer- stage mechanisms. Vision Research, 48(20):2106–2127, 2008. Vision aided geometric design as a criteria in defining “fair” curves [14]. Research Reviews. Cleveland [3] showed that banking to 45° maximizes local curvature. [11] S. G. Makridakis, S. C. Wheelwright, and R. J. Hyndman. Forecasting: Similarly, it is easy to show that minimizing the local arc length max- Methods and Applications. Wiley, New Jersey, third edition, 1998. ISBN imizes local curvature because the numerator of the curvature formula 978-0-471-53233-0. [12] F. Mosteller and J. W. Tukey. Data Analysis and Regression: A Second is invariant to changes in aspect ratio and the denominator is a mono- Course in Statistics. Addison-Wesley, Reading Mass., 1977. tonic transformation of arc length. However, in the process of de- [13] D. Regan and S. Hamstra. Shape discrimination and the judgement of veloping the arc length method, we spent a lot of time attempting to perfect symmetry: Dissociation of shape from size. Vision Research, derive an aspect ratio selection approach directly from curvature with- 32(10):1845–1864, 1992. out success. Our curvature-based methods tended to select extreme [14] N. S. Sapidis, editor. Designing fair curves and surfaces: shape quality aspect ratios which resulted in high curvature at a few locations along in geometric modeling and computer-aided design. 1994. the curve and very low curvature everywhere else, similar to the LOR [15] H. P. Snippe and J. J. Koenderink. Discrimination of geometric angle in results presented in this paper. The problem appears to be that cur- the fronto-parallel plane. Spatial Vision, 8(3):309–228, 1994. vature is a local property of the curve whereas aspect ratio selection [16] W. N. Venables and B. D. Ripley. Modern Applied Statistics with S. requires a global view of the entire curve. Springer, New York, fourth edition, 2002. ISBN 0-387-95457-0. While the arc length method does not immediately suggest a better [17] D. Whitaker and P. V. McGraw. Geometric representation of the perceptual foundation for aspect ratio selection, it does provide three mechanisms underlying human curvature detection. Vision Research, interesting directions for future research. First, and perhaps most sur- 38(24):3843–3848, 1998. prisingly, it demonstrates that effective aspect ratios can be selected without explicit reference to either orientation or angle, suggesting that they may not be strictly necessary elements of a sound aspect ratio theory. Second, the arc length method reveals, somewhat counterintu- itively, that good aspect ratios minimize the visual space taken by the data and maximize white space. This property suggests interesting perceptual hypotheses that might lead to more insight into aspect ratio selection. For example, more white space may reduce our perceptual tendency to filter out high frequencies, thus ensuring that we can see high frequency patterns of interest. Or it may be true that minimum length lines lead to quicker, more accurate, graph reading. Finally, the arc length method emphasizes the importance of symmetry and reg- ularity in displayed curves, properties known to trigger hyper-acuity discrimination for shapes [13]. If this acuity applies to curves it could prove an important asset when the visual analysis task depends on rec- ognizing regularity or detecting small deviations from it. Substantial future work remains to be done to close the gap between perceptual theory and the proposed practical methods for aspect ratio selection. Understanding how the arc length method extends to other plot types, such as 3D plots, may provide insight into this problem. However, well-designed experimental work will likely be necessary to fully understand the perceptual impact of aspect ratio selection.

ACKNOWLEDGMENTS The first author thanks Phil Smith; an early conversation with him di- rectly inspired this work. The first author was supported by FODAVA grant 0937123.