DataTweener: A Demonstration of a Tweening Engine for Incremental Visualization of Data Transforms

Meraj Khan§ Larry Xu; Arnab Nandi§ Joseph M. Hellerstein; The Ohio State University§ University of California, Berkeley; {khan.485,nandi.9}@osu.edu, {larry.xu,hellerstein}@berkeley.edu

ABSTRACT Qnew With the development and advancement of new data inter- action modalities, data exploration and analysis has become UI a highly interactive process situating the user in a session of successive queries. With rapidly changing results, it be- Qnew Qold Rold T comes difficult for the end user to fully comprehend trans- formations, especially the transforms corresponding to com- DataTweener Tweening Engine plex queries. We introduce “data tweening” as an informa- tive way of visualizing structural data transforms, presenting Qnew Rnew the users with a series of incremental visual representations of a resultset transformation. We present transformations DB as ordered sequences of basic structural transforms and vi- sual cues. The sequences are generated using an automated Figure 1: System design for a tweening-based query in- framework which utilizes differences between the consecutive terface. resultsets and queries in a query session. We evaluate the effectiveness of tweening as a visualization method through a user study. users get the right context and make the transformation easier to understand. This change can be brought about by 1. INTRODUCTION a single text query on traditional query interfaces or through a touch gesture on modern direct-manipulation touch-based Data analysis tasks often involve queries which produce interfaces [2, 3, 6]. Direct-manipulation interfaces require large changes in results for relatively small changes in the continuous visual feedback during user interaction to keep queries. Consider the case of the pivot transformation, a the user engaged [7]. Hence, for these modern interfaces, very popular and useful transformation for analytical pur- tweening is a hard requirement. poses. A small change in the query, e.g., changing from the We provide a formalized a visual grammar of basic struc- query Q1 to Q2 causes a large change in the resultset, T1 to tural transforms and visual cues in our work Data Tween- T12 as seen in Figure 3. ing [4], wherein we also presented a framework to generate Q1 = SELECT Id, Year, Dept ordered sequences of micro-operations from this grammar to FROM StudentEnrollment E; encode resultset transformations. The visual grammar is de- signed considering the design principles of animated graph- Q2 = SELECT * FROM ( ics [1, 9] and other works in perception research [8]. We SELECT Year, Dept visualize these ordered sequences as animated transforms FROM StudentEnrollment E) on the query interface described here. PIVOT (COUNT(Dept) FOR Dept IN ("ECE", "CSE")); Such large jumps in the result space can leave the users 2. SYSTEM disoriented as they can clearly notice something changed, but it is difficult to understand what the exact changes were The core of the data tweening based query interface is the and how they were caused. In such cases, tweening (an tweening engine incorporated in the client. When a user incremental visualization of the transformation) can help issues a new query Qnew, the UI passes the query to the tweening engine, which then saves the old query Qold and the old resultset Rold, and issues Qnew to the database for exe- cution. Upon receiving the result Rnew from the database, the tweening engine utilizes the old and new queries and This work is licensed under the Creative Commons Attribution- corresponding resultsets to generate the tweening sequence NonCommercial-NoDerivatives 4.0 International License. To view a copy T . This tweening sequence is then visualized on the UI as an of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For animated transition from Rold to Rnew. This is illustrated any use beyond those covered by this license, obtain permission by emailing in Figure 1. [email protected]. Proceedings of the VLDB Endowment, Vol. 10, No. 12 A tweening sequence is composed of basic transforms and Copyright 2017 VLDB Endowment 2150-8097/17/08. visual cues defined on table entities (list of table cells, columns,

1953 Col1 Col2 Col3 Col1 Col2 Col3 Col1 Col2 Col3 Col1 Col2 Col3 Col1 Col2 Col3 deemphasize([r insert([{r :[B1, A1 A2 A3 3 A1 A2 A3 3 A1 A2 A3 C1 C2 C3 C1 C2 C3 ]) B2, B3]}, order([r , nest([(r B1 B2 B3 1 A1 A2 A3 4 A1 A2 A3 C1 C2 C3 C1 C2 C3 {r5:[B1, B2, r4, r2, r5, c1, r5c1), B3]}]) C1 C2 C3 r3]) B1 B2 B3 (r4c3, B2 r c )]) B1 B3 B1 B2 B3 B1 B2 B3 5 3 B2

Col1 Col2 Col3 Col1 Col2 Col3 Col1 Col2 Col3 C1 C2 C3 C1 C2 C3 C1 C2 C3 Col1 Col2 Col3 delete([r , Col1 Col2 Col3 separate( unnest([{r4c1: emphasize([ 3 A1 A2 A3 A1 A2 A3 A1 A2 A3 [c ]) (2, 1)}]) r3, r4, r5], r4 , r5]) 1 C1 C2 C3 C1 C2 C3 B2 B1 B2 delete) B1 B2 B1 B3 B3 B3

B2 B1 B2 B1 B2

C2 Col2 rotate([{cell- Col1 Col1 C2 Col2 Col1 C2 Col2 Col1 Col2 Col3 bridge([c ])

group:[r1c2, rotate([{text:[ 1 Col3 C3 r c , r c , r c , r1c3, r2c2, r2c3 1 2 1 3 2 2 C1 C3 Col3 C1 C3 Col3 C1 C2 C3 ]}]) C1 r2c3 ]}])

Figure 2: Micro-operations in the Visual Grammar or rows). Figure 2 shows the effect of these micro-operations 3. QUERY INTERFACE on a table. We present a web-based query interface with two interac- The tweening sequence is generated using a fixed rulebase tion modalities – text and multitouch on two independent defined in Table 1. The rulebase defines the ordered se- screens. On the text interface, the user can type in the de- quences of basic transforms corresponding to changes in dif- sired SQL query in a text console, and see the result visual- ferent SQL clauses. If two consecutive queries differ in mul- ized through tweening in the same window. The multitouch tiple clauses, the tweening sequences for respective changes interface defines intuitive touch gestures for basic relational are concatenated to form the complete tweening sequence. data transforms- select, project, aggregate, and cross-tab. In addition to the structural transforms, the sequences in- Each of these transform visualizations are composed of basic clude visual cues which are inserted before / after certain transforms and visual cues described in the earlier section. transforms for providing hints to upcoming and current trans- We restrict the type of queries that can be issued in a query forms in the sequence. The standard visual cues are inserted session to a subset of SQL. We require the FROM clause in into the tweening sequence as per the following rules. all the queries for a given session to be exactly the same. delete is preceded by emphasize, and followed by deempha- 3.1 SQL Console Interface size on the table entities on which delete is to be applied. The users can type in a query in the console at the bottom insert is followed by emphasize, and then deemphasize on right of the browser window. When the user types in a query, the newly inserted table entities. the visualization area is updated from the old resultset to the new resultset through an animated transition. The users order is preceded by annotate showing a note with sorting start by issuing a SELECT query on one of the datasets column, and order (ascending or descending). available in the database. After the first query, the user needs to make sure they use the same FROM clause as that More difficult query changes like pivot are tweened through in the first query in all the subsequent queries. An exception a different set of rules as shown in Figure 3. The tweening to this rule is the pivot query which is required to have the of pivot operation follows the natural logical construct of table being pivoted in the FROM clause expression of the pivot, first showing the aggregation and then following it other queries in the session. Figure 4 shows the console area with the cross-tabulation. In certain cases where at least and the visualization area for this interface. The result for one of the queries from a consecutive pair of queries in a the query typed in the console is shown in the visualization session is an aggregation query, the tweening generation be- area. The query execution is triggered by completion of the comes non-trivial as the tweening engine would need tuples SQL statement in the console. If the user issues an invalid or constituting aggregation of either or both the queries to gen- unsupported SQL statement, an appropriate error message erate a meaningful animation. In such cases, instead of is- is shown on the top right corner of the screen. suing the new query Qnew to the database, the tweening en- gine substitutes” it with a simplified form of the expression- 3.2 Multitouch Interface disaggpQoldq disaggpQnewq, where disaggpQq represents On the multitouch interface, the users can select a database the SQL query which would return the constituent tuples table at the start of the session. The users can then issue forming the resultset corresponding to query Q. Our work queries on the selected dataset through touch gestures. For in [4] details how and why this works. It also covers a few the demonstration, we provide support for basic relational optimizations over these standard techniques which we do operations along with cross-tabulation. The gestures for the not implement in this demo. respective operations are described below:

1954 Table 1: Query-based tweening rulebase Differing Clause Change Tweening Sequence SELECT delete(columns), insert(columns), order(columns) WHERE delete(rows), insert(rows), order(rows) order(rows), highlight(rows, cause=aggregation), in- GROUP BY New Group By Clause sert(aggregateColumn), separate(groups), nest(rows), bridge(groups) GROUP BY Removed Group By Clause delete(aggregateColumn), unnest(rows)

Id Year Dept Id Year Dept Year Dept Table Visualization Area Debug Info Id Year Dept Done Running Query AlbumId Title ArtistId Done Parsing Schema 1 2012 CSE Done Loading DB 1 2012 CSE 1 2012 CSE 2012 CSE Fetching DB 4 2012 CSE 1 Balls to the wall 2 4 2012 CSE 4 2012 CSE 2012 CSE 2 2012 ECE 2 2 2 2012 ECE 2 2012 ECE 2012 ECE 3 2012 ECE 3 Let there be Rock 1 3 2012 ECE 3 2012 ECE 2012 ECE 5 2012 ECE 4 Big Ones 3 5 2012 ECE 5 2012 ECE 2012 ECE 6 2013 CSE 5 Jagged Little Pill 4 10 2013 CSE 6 Facelift 5 SQL Console 6 2013 CSE 6 2013 CSE 2013 CSE 11 2013 CSE 7 Warner 25 Anos 6 SELECT .AlbumId, 10 2013 CSE 10 2013 CSE 2013 CSE 8 Audioslave 8 Album.Title, Album.ArtistId 12 2013 CSE FROM Album; 11 2013 CSE 11 2013 CSE 2013 CSE 7 2013 ECE 9 Out of Exile 8 12 2013 CSE 12 2013 CSE 2013 CSE 8 2013 ECE Figure 4: SQL console query interface 7 2013 ECE 7 2013 ECE 2013 ECE 9 2013 ECE 8 2013 ECE 8 2013 ECE 2013 ECE T1 9 2013 ECE 9 2013 ECE 2013 ECE SELECTION - The query interface provides two mecha- T4 T2 T3 nisms for selection. Users can either pinch-in (Figure 5) rows to delete them or they can specify an attribute- Year Dept Count Year Dept Count Year Dept Count specific WHERE clause predicate in the pop-up dialog opened by tapping on a column header. The predicates 2012 CSE 2012 CSE 2012 CSE 2 2 2 are defined in the same syntax as that in standard SQL. 2012 ECE 3 2012 CSE 2012 CSE For deleting rows using the pinch-in gesture, users need 2012 ECE 2012 ECE to tap on two cells in the same column separated by at 2012 ECE 3 2012 ECE 3 2013 CSE 4 least one row and swipe inwards. This would cause all the 2012 ECE 2012 ECE 2013 ECE 3 rows in between the two touchpoints (excluding the rows T7 containing them) to be deleted. 2013 CSE 2013 CSE Year Dept Count 2013 CSE 2013 CSE 4 4 PROJECTION - The users can delete a column with the 2013 CSE 2013 CSE 2012 CSE 2 pinch-in touch gesture (Figure 5). This works exactly like 2013 CSE 2013 CSE 2012 ECE 3 the pinch-in gesture for selection. The users can select 2013 ECE 2013 ECE two distinct cells in the same row and swipe inwards to 2013 ECE 3 2013 ECE 3 delete the block of columns between the two points. 2013 CSE 4 2013 ECE 2013 ECE 2013 ECE 3 T5 T6 AGGREGATION - The users can initiate the aggregation T8 gesture by a single finger swipe directed from the measure column to the group-by column after which they would be prompted to pick the aggregation function. The swipe Year Year ECE CSE gesture for aggregation and the final output for a sample

Year table is shown in Figure 6. CSE ECE 2012 3 2

ECE CSE 2 2012 3 2012 CROSS-TABULATION - A cross-tab can be performed 3 2 2013 3 4 by an arc-shaped swipe gesture originating at a cell in the column to be transposed to the attribute headers and di-

T11 CSE ECE ECE CSE rected upwards towards the table headers while holding 2013

3 4 Year ECE CSE down a cell in the pivot column. The gesture is depicted 4 2013 3 2012 3 2 in Figure 7. To maintain a continuous visual feedback, T10 2013 3 4 once the system figures out the cross-tab query during the T9 swipe gesture, it starts showing the transition to the new T12 resultset. For example, for the cross-tab query demon- strated in Figure 7, an animated transition from T7 to Figure 3: Incremental Visualization of Pivot Transfor- T12 in Figure 3 is played. mation introduced in Section 1

1955 12 Title AlbumId Title ArtistId AlbumId ArtistId Clarity Understandability 1 Balls to the wall 2 1 Balls to the wall 2 10 2 Restless and Wild 2 Restless and Wild 2 2 8 3 Let there be Rock 1 3 Let there be Rock 1 6 4 Big Ones 3 Big Ones

4 3 Ra.ng 5 Jagged Little Pill 4 5 Jagged Little Pill 4 4

Figure 5: Pinch-in gesture for SELECTION and PRO- 2 JECTION on multitouch interface 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 User AlbumId ArtistId Pick Aggregate Function Figure 8: User rating for pivot tweening contrast to the standard sql resultset visualization where the 1 1 COUNT MAX MIN AVG changes in resultsets are abrupt and discontinuous. For ex- 2 1 ample, the pivot transformation corresponding to a change in query Q1 to Q2 introduced in Section 1 would be displayed 3 2 ArtistId Count as an animated transition stepping through the frames shown 4 3 1 2 in Figure 3. The same transformation on the standard query interface would be displayed as an abrupt change from T1 5 4 2 1 to T12. A demo video (http://go.osu.edu/datatweener) 6 5 3 1 is provided for elucidation. 7 6 4 1 6. CONCLUSION AND FUTURE WORK 8 8 5 1 This work describes an initial implementation of a query interface with support for data-tweening. Users can get real- 9 8 6 1 time feedback in the form of animated transition for a subset 8 2 of SQL queries. We have shown that the visual grammar has Figure 6: Aggregation on multitouch interface the expressivity to encode a wider SQL subset than sup- ported by the current implementation. In the future, we plan to build a full-fledged query interface with tweening EnrollmentYear Dept Count EnrollmentYear CSE ECE support, and add playback capability for all actions. We 2012 CSE 2 2012 2 3 also plan a comparative study to see if users perform any better on data analysis tasks with a tweening enabled query 2012 ECE 3 2013 4 3 interface over traditional query interfaces. Another interest- 2013 CSE 4 ing future work is extending tweening to very large datasets. 2013 ECE 3 Acknowledgements We acknowledge the support of the Figure 7: Cross-tabulation on multitouch interface U.S. National Science Foundation under awards IIS-1422977, IIS-1527779, CAREER IIS-1453582, and IIS-1564351. 4. USER STUDY 7. REFERENCES To study the effectiveness of tweening as a visualization [1] Heer, Jeffrey and Robertson, George G. Animated method and its role in helping people understand complex Transitions in Statistical Data Graphics. TVCG, pages transformations, we conducted a user study [5]. We re- 1240–1247, 2007. cruited 20 undergraduate and graduate students from dif- [2] Idreos, Stratos and Liarou, Erietta. dbTouch: Analytics ferent backgrounds to participate in an in-persosn study. at your Fingertips. CIDR, 2013. The users were asked to rate our pivot tweening sequence [3] Jiang, Lilong and Nandi, Arnab. Designing interactive for clarity and understandability on a scale of 0 ¡ 10. We query interfaces to teach database systems in the also asked them if the tweening helped them understand the classroom. SIGCHI, pages 1479–1482, 2015. pivot transformation better. Figure 8 shows the rating for [4] Khan, Meraj and Xu, Larry, and Nandi, Arnab and pivot tweening by each user. 18 out of the 20 users said they Hellerstein, Joseph M. Data tweening: Incremental found tweening helpful in understanding pivot. The users visualization of data transforms. VLDB, pages 661–672, rated the tweening sequence high both for clarity (mean - 2017. 8.23, standard error - 1.24) and understandability (mean - [5] Kosara, Robert and others. Thoughts on User Studies: 8.48, standard error - 1.31) Through this study, we conclude Why, How, and When? CGA, pages 20–25, 2003. users find tweening helpful in understanding complex trans- [6] Nandi, Arnab and Jiang, Lilong and Mandel, Michael. formations and hence is an effective visualization method. Gestural Query Specification. VLDB, pages 289–300, 2013. 5. DEMONSTRATION [7] Shneiderman, Ben. Direct Manipulation: A Step For the demonstration, we will present a web app on a Beyond Programming Languages. Sparks of innovation touch-screen laptop. The users can opt to either use the in HCI, page 17, 1993. console-based interface or the multitouch interface. The [8] Todorovic, Dejan. Gestalt Principles. Scholarpedia, users can select a publicly available datasets hosted on an 2008. in-browser sqlite database. Users can see complex data [9] Tversky, Barbara and Morrison, and Julie Bauer and transformations broken down and visualized into simple and Betrancourt, Mireille. Animation: Can It Facilitate? smooth transitions between distinct resultsets. This is in IJHCS, pages 247–262, 2002.

1956