Holistic Data Access Optimization for Analytics Reports∗

ABSTRACT ORM. It also shows that normalized-sets plans out- Object-Relational Mappers (ORMs) enable single language perform denormalized-sets plans by manyfold for the case of access to both the main memory data and the database data analytic reports, which was not the case in semistructured of an application. Unfortunately, they also lead to perfor- queries without aggregation. mance inefficiencies, especially in analytics applications with The Collage query processor has been integrated and de- information-rich reports involving nested and aggregated re- ployed as part of a complete web application development sults over large data volumes. framework, including a templating enegine. Using an online Past database research suggests that a report over a database IDE, one can build a wide class of web applications with can be modeled by a single semi-structured query, i.e. a both the ease of single language access, and performance of query involving nesting and heterogeneity. Thus, the re- holistic optimization. port’s data accesses can be holistically optimized through a single query. Collage resolves important practical challenges 1. INTRODUCTION towards enabling the report-as-query approach and its holis- It has been well-known since the 80s that the impedance tic optimization. In particular, the Collage middleware sys- mismatch between the SQL database and the application tem models a report page as a SQL++ query. SQL++ is layer is a major time waste in application development [8]. A a superset of SQL by two extensions: (a) It provides dis- developer has to use two languages (SQL and an application tributed access to an SQL database and the in-memory ob- , such as Java, Ruby or Python) for jects of the web application programming language. (b) The data access, and tediously convert between relational tables query output has the full generality needed by report tem- and objects. plating engines built around HTML and JSON. Indeed, the Best practices among practitioners to mitigate impedance SQL++ is a superset of JSON. The Collage mismatch are to utilize Object-Relational Mappers (ORMs). implementation features its own report templating engine With ORMs, developers specify mappings from relational but one can easily utilize Collage’s query processor in other tables into classes of the application programming language. HTML and JSON-based templating engines. Consequently, the ORM automatically translates navigation On the query optimization side, Collage expands into an- across objects of these classes into SQL queries, often involv- alytics queries. The following query optimization problems ing joins. With this object navigation paradigm, ORMs offer emerge and are solved. (1) Novel rewritings guarantee the single language access to all data, be them native objects of compilation of any SQL++ query into an efficient set-at- the application or tuples of the database. The pervasive a-time query plan, assuming the objects of the application adoption of ORMs1, such as Hibernate, Active Record of programming language offer object id’s that can be com- Ruby-on-Rails (RoR) and CakePHP is a testimony to the pared and be efficiently grouped. This guarantee includes importance of the impedance mismatch problem, and the analytics queries, i.e., queries that involve aggregation and desire for single language access to database tuples and ap- top-k processing, which were unaddressed by prior works on plication objects. semi-structured queries. (2) Normalized and denormal- While ORMs provide single language unified access for ized set-at-a-time query plans are formulated and evaluated. simple data access patterns, this ease of use often comes The evaluation validates the manyfold speedup over a main- at a performance cost of queries in very common reporting ∗ applications, especially analytics applications that display Supported by NSF III-1018961 and NSF III-1219263, PI’d nested, aggregated data [18, 17]. As a running example, con- by Prof Papakonstantinou who is a shareholder of App2You Inc, which commercializes outcomes of this research. sider the web application in Figure 1 that has a dashboard page for visualizing and analyzing sales data of a TPC-H [22] database. A user uses the checkboxes to select nations to analyze, and the application stores the selections as in- memory objects within the HTTP session of the application Permission to make digital or hard copies of all or part of this work for server. The application then displays a HTML table of se- personal or classroom use is granted without fee provided that copies are lected nations, where each row comprises the nation name not made or distributed for profit or commercial advantage and that copies 1 bear this notice and the full citation on the first page. To copy otherwise, to In this paper, code examples are presented in RoR, since republish, to post on servers or to redistribute to lists, requires prior specific RoR is a widely used , and Active Record, its permission and/or a fee. built-in ORM, supports many mainstream database engines. Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00. Nevertheless, the discussion applies to other popular ORMs. and the top 3 years of sales revenue. almost always incurs a manyfold performance penalty due to extraneous sequential scans or repeated random accesses, -.$/%-#) as will be demonstrated in our experiments. Worse still, the -.$/%-*+',) performance penalty ratio increases with the amount of data -.&') !%&&'-$) accessed and visualized, since the number of queries issued is dependent on the number of nations. !"#$%&'(#) A sophisticated developer who optimizes for performance will re-implement tuple-at-a-time queries as set-at-a-time !"#$*+',) -.$/%-*('0) queries. Two alternatives demonstrated in this paper are: .11('##) • Denormalized-sets queries: Queries that retrieve re- %(1'(#) sults as denormalized-sets. For example, a single query %(1'(*+',) that retrieves the join of nations and top sales. !"#$*('0) %(1'(*1.$') • Normalized-sets queries: Queries that retrieve results $%$.2*3(/!') as normalized-sets. For example, a query that retrieves Figure 1: Analytics application for TPC- Figure 2: nations, and a separate query that retrieves top sales with corresponding nation ids. H data TPC-H schema An ORM induces developers to decompose data access ! "#$%&'#()*)+,)-#./,0%.#1)'(#23%4#4)4%3'#,%#-/,/5/()#,)4&%3/3'#,/5*)(# into multiple object navigations (translated by the ORM 6 7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?# into multiple queries) that are interspersed among loops and @ ##A$9B7CB#CBDEF979G#C7;HB#()*)+,)-I./,0%.(#?./,0%.I3)2#JKCBLB9MA#M# N ()*)+,)-I./,0%.(<)/+O#-%#P.P# conditionals of imperative languages. This decomposition Q ##7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?# R ####SJKTB9C#JKCF#()*)+,)-I./,0%.(?./,0%.I3)2M#U7HVBT#?A"W.XAMS#M# reduces data access performance: Even though each individ- Y ).-# ual query may be optimally executed by the database, the Z "#T,%3)#+%44%.#(>5[)=&3)((0%.#2%3#()*)+,)-#./,0%.(#0.#,)4&%3/3'#,/5*)(# set of queries collectively may not compute the report effi- \ 7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?]]T^H# ciently. Such inefficiencies can be very large in analytics ap- !_ ##$9B7CB#CBDEF979G#C7;HB#,)4&I./,0%.(#7T# !! ##TBHB$C#.<./,0%.I1)'`#.<./4)# plications producing reports that involve nesting, joins and !6 ##a9FD###./,0%.(#7T#.# aggregation. In particular, this paper focuses on the ineffi- !@ ##bFJK###()*)+,)-I./,0%.(#7T#(#FK#(<./,0%.I3)2#c#.<./,0%.I1)'# !N T^H# ciencies of tuple-at-a-time queries, where each outer context !Q M# tuple causes the execution of an inner query that produces !R "#T>5[d>)3'#,%#3),30)8)#-0(,0.+,#./,0%.#1)'(# nested tuples. !Y d!#c#ATBHB$C#eJTCJK$C#./,0%.I1)'#7T#./,0%.I1)'I6#a9FD#,)4&I./,0%.(A#

!Z "#T>5[d>)3'#,%#3),30)8)#/ff3)f/,)(#).[4/(()# ! "#$$%&'()*+,'%-./%01'023.4'5%$$6% !\ d6#c#73)*::C/5*)<.)g?:+>(,%4)3(M# 7 ",+3*'6% 6_ ##<&3%h)+,?A,!<./,0%.I1)'I6`## 8 %%"9%!"#$%&'"((')"*+',%-.&.-96% 6! ############-/,)I&/3,?iA')/3iA`#%<%3-)3I-/,)M#/(#%3-)3I')/3`## : %%%%"9%$/-0)00$%&120)()*#),3&"#$%&0241&'&"#$%&35)64-77-#89)-#+)&-96% 66 ############TVD?%<,%,/*I&30+)M#7T#(>4I&30+)AM# ; %%%%%%",/6% 6@ ##<23%4?A+>(,%4)3(#/(#+AM# < %%%%%%%%",=6"9>%&'&":)%96"?,=6% 6N ##(,I1)'#c#%<+>(,I3)2#A#j# @ %%%%%%%%",=6% 6Q #########SbFJK#?"Wd!XM#7T#,!#FK#+<./,0%.I3)2#c#,!<./,0%.I1)'I6SM# A %%%%%%%%%%"9%%";;8);"#)0-7-<8,)8- 6R ##&?A,!<./,0%.I1)'I6`## B %%%%%%%%%%%%%%%%'0)()*#=2,"#)3>"8#=?26)"8?2@-%8,)83,"#)A-"0-%8,)836)"8@-- 6Y ##########-/,)I&/3,?iA')/3iA`#%<%3-)3I-/,)MAM# !C %%%%%%%%%%%%%%%%%%%%%%%%%09:=#%#"(3>8$*)A-"0-09:3>8$*)2A- !! %%%%%%%%%%%%%%%%'B%$&0=2*90#%:)82A- 6Z "#T>5[d>)3'#,%#&/3,0,0%.#/ff3)f/,)(#5'#./,0%.#1)'# !7 %%%%%%%%%%%%%%%%'C+)8)=2&"#$%&38)/-7-D2@-&'&"#$%&35)6A- 6\ d@#c#]]T^H# !8 %%%%%%%%%%%%%%%%';8%9>=2,"#)3>"8#=?26)"8?2@-%8,)83,"#)A-"0-%8,)836)"82A- @_ ##TBHB$C#,645)3?M#FUB9?# !: %%%%%%%%%%%%%%%%'%8,)8=209:3>8$*)-EFGH2A- @! #################E79CJCJFK#;G#,6<./,0%.I1)'I6# !; %%%%%%%%%%%%%%%%'($:$#=IA--96% @6 #################F9eB9#;G#,6<(>4I&30+)# !< %%%%%%%%%%"#$$%&'()*+,'%-./%3+/%01+/,%D+E+F0/G),%0.().H'H,%$$6% @@ ###############M#7T#3%gI0-# !@ %%%%%%%%"?,=6% @N ##a9FD#?"Wd6<,%I(d*?MXM#7T#,6# !A %%%%%%"?,/6% @Q T^H# !B %%%%"9%)&,%96% 7C %%"9%)&,%96% @R "#B=)+>,)#d>)3'#,O/,#3),30)8)(#,%&[@#/ff3)f/,)(#0.#)/+O#&/3,0,0%.# 7! "?,+3*'6% @Y ,%&I(/*)(#c#F3-)3# @Z ##<()*)+,?A,@<./,0%.I1)'I6`#,@<%3-)3I')/3`#,@<(>4I&30+)AM# @\ ##<23%4?S?"Wd@XM#/(#,@SM# Figure 3: RoR page with tuple-at-a-time queries N_ ##*,#c#l/(O<.)g#W#PO/(O`#1)'P#O/(Om1)'n#c#W# implements Figure 1, and operates over the TPC-H schema N@ ####:./4)#######co#.0*`# in Figure 2. Data access code in the figure are denoted with NN ####:/ffI%3-)3(#co#mn# NQ X#X# italics. Line 3 uses an each loop with variable n to iterate NR ,%&I(/*)(<)/+O#-%#P,P# through all nations, and Line 4 retains only nations that are NY ####3)(>*,m,<./,0%.I1)'I6n4I&30+)##co#,<(>4I&30+)# Q_ ####X# Within the loop and conditional, Line 6 produces the na- Q! ).-# tion name, whereas Lines 7-17 produce the bar chart. Since Q6 "#b%0.#./,0%.#1)'(#g0,O#./,0%.#./4)(# arbitrarily complex statements can occur within the loop Q@ ./,0%.(#c#7+,08)9)+%3-::;/()# and conditional, RoR does not attempt to holistically opti- QN ##########<()*)+,?A./,0%.I1)'#`#./4)AM<23%4?A,)4&I./,0%.(AM# QQ ./,0%.(<)/+O#-%#P.P# mize multiple data access points into more efficient queries. QR ####3)(>*,m.<./,0%.I1)'n<./4)#c#.<./4)# Instead, it simply executes statements in an imperative, QY ).-# tuple-at-a-time fashion: For each selected Nation object n, Figure 4: RoR page with normalized-sets queries Lines 8-15 are translated into a SQL query that has a sin- gle parameter for the nation key of n. Parameterizing the For example, consider the RoR fragment for normalized- inner queries causes k nations to require k queries, which sets queries in Figure 4, which is manyfold faster than the tuple-at-a-time queries in Figure 3: our experiments show a SQL++ queries are set processable, and Collage utilizes novel 7x speedup on a 3GB TPC-H database. rewritings to holistically optimize naive and inefficient tuple- While ORMs make it easy to write tuple-at-a-time queries, at-a-time plans to efficient set-at-a-time plans (i.e. denormalized- they do not facilitate the significantly more efficient set- sets or normalized-sets plans). To the best of our knowledge, at-a-time queries. A developer has to be a SQL expert these rewritings target the largest class of nested queries to understand that, since the nested query computes ag- among that considered by prior work: They (a) capture ag- gregates and top-k results, retrieving en masse the top 3 gregation and ranking, which are expensive and common in sales years for each nation id requires partitions and win- modern analytics applications, (b) are applicable over multi- dow functions (OVER (PARTITION BY ... ORDER BY ...) ple levels of nesting (c) handle arbitrary types of correlation in Lines 16-40), which is an advanced and esoteric feature between the enclosing query and the nested query. Our ex- in SQL 2003. periments validate that the rewritings achieve a manyfold Worse, the developer is burdened with explicit distributed speedup (e.g. 7x on a 3GB TPC-H database). Further- programming: he has to (a) copy intermediate results across more, larger databases and heavier visualizations result in the memory/database boundary while minimizing communi- even greater speedups. We also observe that set-at-a-time cation costs (Lines 2-7), (b) store common sub-expressions in plans outperform tuple-at-a-time plans by a large margin temporary tables (Lines 9-15), (c) write ad-hoc imperative for heavy queries, and remain competitive for inexpensive code to perform nesting and joins in the application layer ones, which suggests that the rewritings should be applied (Lines 42-51, 53-57) and (d) utilize different APIs based on whenever possible. which source the data resides in (SQL commands in Lines 1- Performance comparison of different optimizations 15, ActiveRecord in Lines 16-40, object APIs in Lines 41-57). Our performance analysis and extensive experiments reveal Not only does this lack of location transparency fall short of that rewriting to normalized-sets plans is highly superior single language access, it may also result in performance in- to denormalized-sets plans, especially for complex analytics efficiencies since neither the programming language nor the queries. The efficiency gains are attributable to both obvi- database can take advantage of optimization opportunities ous reasons, such as data redundancies and communication in the integration code across the two systems. overhead incurred by denormalization, as well as surprising To address these issues, the database community has pro- ones, such as SQL database optimizations enabled through posed many novel solutions that enable holistic optimization the removal of outer-joins. of queries in reporting pages. Systems such as FERRY [18] Lightweight integration of external objects Using and SWITCH [17] compile imperative data access code into Collage as middleware, a developer can easily use SQL++ an algebraic representation, thereafter apply holistic opti- as the single language to access and integrate objects of ex- mization of nested queries. Separately, in declarative web ternal data sources, including SQL databases, JSON data frameworks such as Strudel [11] and Hilda [24], pages are and Java objects. These objects can be defined in a different modeled as nested queries, which is a first step towards holis- programming language and type system (thus going beyond tically optimization of the page. object databases), and in contrast to conventional data in- In the same spirit, we present Collage [2, 3, 4], a declara- tegration systems, the developer does not incur the upfront tive framework for data-driven web applications that provide burden of mapping external classes into SQL++ schemas. novel holistic optimizations of reporting queries in analytics This is enabled by a novel SQL++ feature where query applications. In the current implementation, a developer paths navigate lazily into external objects, and each path declaratively specifies the reported data with a page query, step dynamically bind to the next object reference returned which executes over multiple data sources. The page query by a method call. Collage treats external objects as first- is written in SQL++, which is backwards compatible with class citizens, and extends query processing correspondingly SQL, captures the JSON data model, and easily integrates for heterogeneity and lenient type checking. the application programming language objects that are ac- We have deployed Collage as a complete web application cessed by reports. Furthermore, the page query is distributed framework made available on the cloud. A demo system, over a persistent SQL database and in-memory application- which includes multiple sample queries such as the running level objects of requests and sessions. example of Figure 1, is online at http://ec2-54-244-248-12. In its current implementation, Collage simultaneously pro- us-west-2.compute.amazonaws.com2. Furthermore, Collage vides ease of use for the developer through the SQL++ sin- is currently deployed in real-world applications, including an gle language access, as well as efficient performance through analytics application currently utilized by two pharmaceu- holistic query optimization within its distributed query pro- tical companies. cessor. Thus, data access within analytics applications is This paper is structured as follows. Section 2 illustrates reduced to a special semi-structured and distributed query the syntax and semantics of SQL++ through an example processing problem. Collage application. Sections 3 and 4 present the techni- Systems that compile imperative code into queries (like cal details of Collage, including the SQL++ data model, FERRY and SWITCH) also stand to benefit from using Col- SQL++ language extensions and optimizations for tuple-at- lage’s query processor, as it provides a flexible data model a-time queries. Section 5 compares the performance gains of (both in terms of input and ouput) and novel optimizations Collage’s optimizations. Finally, Section 6 presents related for complex reports that involve aggregation and topk. work in database and programming language research.

1.1 Contributions This paper focuses on the distributed query processor of Collage, which builds upon existing techniques to introduce 2The machine is hosted anonymously in the AWS cloud novel query processing optimizations: service in compliance with SIGMOD’s double-blind require- Optimizing tuple-at-a-time queries We show that all ments. 2. ARCHITECTURE, SYNTAX AND SEMAN- ! "#$%&%'()%*+%,-./%01$*2(343,%)23$15+ 6 ++!"#"$%&&'(')*+,'-./01&'(')2/1&3& TICS 7 ++++++++++++!"#"$%&&&45(4)*/-6)7*380/)781&,(,74/7-4)*/9&:!&,74/7-0/)71& 8 +++++++++++++++++++++!;<3,(*,*)=-67+>/9&:!&?@2-67+>/& Collage provides an efficient template engine [23] for web 9 ++++++++++++ABC<&&&&&45(,74/7?&:!&,1& : ++++++++++++&&&&&&&&&45(>@?*,2/7?&:!&>& frameworks that follow the Model-View-Controller (MVC) ; ++++++++++++DE"B"&&&&,(>@?*-7/F&G&>(>@?*-./0&:HI& architecture pattern, which is practically all web frameworks. < ++++++++++++&&&&&&&&&>(')*+,'-7/F&G&')*+,'-./0++++++ = ++++++++++++JBC;K&LM&45(4)*/-6)7*380/)781&,(,74/7-4)*/9& In MVC web frameworks, an application is modularized into !> ++++++++++++CBI"B&LM&?@2-67+>/& actions (Controllers) and pages (Views). A HTTP request !! ++++++++++++#N*/4-')*+,'?&:!&?1& !8 ++&&&&&&&&45(')*+,'?&:!&'& guage of the web framework (e.g. Ruby for RoR, Java for !9 ++DE"B"&&&?(')*+,'-7/F&G&'(')*+,'-./0&:HI& Spring Web). The action reads and writes the application !: ++&&&&&&&&?(?/=/>*/4&G&*7@/& state, which includes a persistent SQL database and tran- !; "?#$%&%'()%*5+ sient in-memory objects stored in request and session scopes !< "%,@A/5+ != ++"#$%&%'#2-+$2B-C/01?Q,R'-')*+,'?15+ of the application server. An action terminates by choosing a 6> ++++"%-5+ page to be displayed next. To display the chosen page, the 6! ++++++"%D5+E+')2/+F+"?%D5+ 66 ++++++"%D5+ web framework then uses a template engine to evaluate a 67 ++++++++"#B3)%'@,-4C*,-%5+ page template which, in adherence to MVC best practices, 68 ++++++++++"#$%&%'#2-+$2B-C/01)PP7/P)*/?15+ 69 ++++++++++++"C2AB&35+ should simply instantiate HTML markup without causing 6: ++++++++++++++"A,@/A5+E+,74/7-0/)7+F+"?A,@/A5+ 6; ++++++++++++++"G,AB/5+E+?@2-67+>/+F+"?G,AB/5+ side-effects on the application state. Web frameworks allow 6< ++++++++++++"?C2AB&35+ different template engines to be installed as plugins. For ex- 6= ++++++++++"?#$%&%'#2-5+ 7> +++++++"?#B3)%'@,-4C*,-%5+ ample, RoR works with template engines such as ERB (as 7! ++++++"?%D5+ shown in Figure 3), Haml, and Slim. 76 ++++"?%-5+ 3 77 ++"?#$%&%'#2-5+ Collage’s template engine can be used in Java-based MVC 78 "?%,@A/5+ web frameworks, such as Struts and Spring Web. Collage is also available as a standalone web framework, which fur- Figure 6: Collage page that splits queries from markup ther mitigates impedance mismatch in actions by specifying them as PL/SQL statements [4]. However, in this paper we easily specified using only semi-structured distributed queries focus on Collage’s template engine for pages that are ana- and template markup. lytics reports, and do not further discuss the use of Collage As with all template engines, the developer can combine for actions. queries and markup by either inlining queries within the markup (e.g. Figure 5, which is similarly inlined as the RoR ! "#$%&'() page of Figure 3) or modularizing them into two separate * ))"+,#-#.+/0)12'0345) 6 ))))!"#"$%&&&&'(')*+,'-./01&'(')2/& parts (e.g. Figure 6). Typically inlining is preferred for sim- 7 ))))3456&&&&&&7/77+,'(7/8/9*/:-')*+,'7&;!&71& 8 )))))))))))))):<(')*+,'7&;!&'& ple pages, whereas modularizing is for pages with complex 9 ))))=>"4"&&&&&7(')*+,'-?/@&A&'(')*+,'-./0&;BC& data access and visualizations. In Figure 5, the semantics : ))))))))))))))7(7/8/9*/:&A&*?D/& ; ))5() of the fstmt:for statement (Lines 2-8) is that it evaluates < ))))"#0() its query, and instantiates its body (Lines 9-31) for each tu- != ))))))"#>()?)')2/)@)"A#>() !! ))))))"#>() ple in the result. For example, a tr element is output for !* ))))))))"+2BC#.%$0DEF$0#() each nation. Expressions enclosed in “{” and “}” utilize at- !6 ))))))))))"+,#-#.+/0)12'0345) !7 ))))))))))))!"#"$%&&&&:<(:)*/-E)?*FG0/)?G1&,(,?:/?-:)*/H&;!&,?:/?-0/)?1& tributes of the enclosing query. Expressions within HTML !8 ))))))))))))))))))))))!I6F,(*,*)8-E?+9/H&;!&7D2-E?+9/& tags are output as strings (Line 10), whereas those within !9 ))))))))))))3456&&&&&&:<(,?:/?7&;!&,1& !: )))))))))))))))))))))):<(9D7*,2/?7&;!&9& visual unit tags such as funit:bar_chart (Lines 25-26) are !; ))))))))))))=>"4"&&&&&,(9D7*-?/@&A&9(9D7*-./3&;BC& !< ))))))))))))))))))))))9(')*+,'-?/@&A&')*+,'-./0)))))) used as the model of JavaScript components. Collage is *= ))))))))))))J45IK&LM&&:<(:)*/-E)?*FG0/)?G1&,(,?:/?-:)*/H& responsible for efficiently and incrementally rendering both *! ))))))))))))54C"4&LM&&72-DG0CE') ** ))))))))))))#N6N%&&&&&O& HTML and JavaScript components [2]. *6 ))))))))))5() Collage accepts SQL++ queries, which are a superset of *7 ))))))))))))"E/&2-B() *8 ))))))))))))))"&$%'&()?),?:/?-0/)?)@)"A&$%'&() SQL queries and thus familiar to a large audience of SQL de- *9 ))))))))))))))"H$&2'()?)7D2-E?+9/)@)"AH$&2'() *: ))))))))))))"AE/&2-B() velopers. Sections 3 and 4 provide details on these SQL++ *; ))))))))))"A+,#-#.+/0() extensions. Internally within the Collage framework, a page *< ))))))))"A+2BC#.%$0DEF$0#() 6= ))))))"A#>() compiler always compiles the fstmt:for queries of a page 6! ))))"A#0() into a single SQL++ query with nesting, as in Figure 6, so 6* ))"A+,#-#.+/0() 66 "A#$%&'() that Collage can subsequently optimize the SQL++ query by rewriting the initial tuple-at-a-time plan into a more ef- ficient set-at-a-time plan (Section 4.2). Figure 5: Collage page that inlines queries within markup In Collage, page templates (such as Figures 5 and 6) as- 3. DATA MODEL semble the data of the page with SQL++ queries, which are distributed over the database and in-memory objects. The input to the SQL++ queries is essentially a graph The semi-structured query results are rendered via Collage’s of tuples. Formally, an SQL++ object is a pair of an id markup into HTML and JavaScript components, such as and an SQL++ value.A named id can be explicitly written maps, charts, and calendars. Therefore, report pages are in queries. An unnamed id cannot be explicitly written in queries. An SQL++ value is either 3The same fundamental principles can be used to integrate Collage’s template engine with web frameworks of other lan- 1. a typed atomic value, such as the string ’abc’ or the guages such as Ruby or PHP. integer 7. 2. an object reference, which is either the id of an object or fore being able to creating output heterogeneous results even the special constant null. when the input is homogenous. Navigation in nested values and navigation through 3.a tuple [a1 : v1, . . . , an : vn], where the attribute names object graphs In the spirit of XQuery and OQL, the SQL++ a1, . . . , an are strings and each attribute value vi is an SQL++ value. FROM allows clauses where a variable ranges over the collec- tion that is bound to another variable, potentially involving 4.a collection {e1, . . . , en}, where each ei is recursively an navigation through tuples. The navigations of SQL++ can SQL++ value. access nested values and can also access the nesting that is Since SQL++ is an extension of SQL, Collage easily mod- created by object references, without making a distinction els an SQL database as an SQL++ source. Given a stan- between the two. dard SQL relation R, Collage models the SQL database as Location transparency A developer writes a SQL++ query an object identified by the named id R. The value of R is in a location-transparent fashion, even when it spans multi- a collection of tuple values, where the tuples have atomic ple data sources. In the running example, the developer has values. mapped (not shown) the db data source to a SQL database, Objects of programming languages (Java, Javascript, etc) and the session data source to the in-memory HTTP ses- are also easily wrapped into SQL++. In the running ex- sion object of the application server. In Figure 6, database ample, there is a session attribute selected_nations whose tables (Lines 5, 6, 14) and in-memory objects (Line 13) are value is a Java List of (references to) objects that have prefixed by respective data source names. Source-specific a boolean property selected and an integer property na- functions (such as the db.date_part stored procedure, Line 3) tion_ref. Collage models this session attribute and the are similarly prefixed, whereas flexible functions (such as = data accessible via it as an object identified by the name in Lines 7, 13, 14) that can be executed at any data source selected_nations. The value of selected_nations is a are not prefixed. collection of object references to objects that have an un- Non-set results Unlike SQL where queries always output named id and a tuple value, where one of the attributes is tables, SQL++ queries can be simple expressions. For ex- named selected and has boolean values. ample, 1 + 1 outputs a SQL++ scalar. This paper does DAG and cyclic structures are possible by having object not further discuss such queries, since they do not present references that point to the same object. Notice that this set processing opportunities. modeling is virtual, in the sense that no activity will take place until a query needs to reach some of the reachable 4.1 Plans and Algebra Overview data. Therefore, even if a huge amount of Java data are A logical plan p = T ← e ; ... ; T ← e ; e starts with a accessible by navigations that follow object references, none 1 1 n n list of zero or more assignments T ← e , where each T is will be accessed until a query needs to. i i i a temporary result and each e is a SQL++ algebra expres- Notice that SQL++ can also be seen as an extension of i sion that may input the previously computed temporaries JSON. JSON is particularly important because it is readily T ,...,T . The result of p is the SQL++ table resulting consumed by the visualization components utilized in Col- 1 i−1 from the evaluation of the result expression e, which may lage’s template engine and other template engines. input any temporary T1,...,Tn. The algebraic operators involved in SQL++ algebra ex- 4. pressions input and output tables, i.e., collections of ho- The SQL++ query language is backwards-compatible with mogenous tuples. Intuitively, each attribute of a table cor- SQL, featuring extensions analogous to the extensions that responds to a variable of the FROM clause or some value that the SQL++ data model introduced over the SQL data model: is reachable via the variables. Each tuple corresponds to a Nested value output Unlike SQL which restricts the SE- binding of the variables. Therefore we will call them binding LECT clause to only allow sub-queries producing scalars, in attributes and binding tuples respectively. SQL++ the SELECT clause can contain any SQL++ sub- The majority of SQL++ algebra operators are extensions query. This extension enables creating nested values. For of operators well-known from conventional SQL processing example, the running example’s query (Figure 6, Lines 2-16) (e.g., see textbooks [15]). While conventional operators in- creates a collection of tuples, where each tuple corresponds put and output tables where the values of the binding at- to a nation and has an attribute aggregates whose value is tributes are only atomic values, SQL++ extends the opera- itself a collection of tuples. tors to allow the binding attribute values to also be collec- The default SELECT-FROM query creates collections of tu- tions, tuples and object references. The list of standard op- ples - as in SQL. Minor syntactic variations of the SELECT erators comprises cartesian product ×, union ∪, intersection clause enable the creation of other types of collections, such ∩, difference −, selection σc, join ./c, full outer-join ./ c, left as collections of atomic values and collections of collections. outer-join ./c, semi-join >

Translating the FROM clause with Ground, Scan and mally, χG1...Gn;eX operates on an input table that has at- Navigate operators The role of the ground (Ground) and tributes G1 ...Gn. It groups the input tuples into input sub- scan (Scan) operators is to provide the algebraic counterpart collections I1,...,Il according to the group-by attributes of the FROM clause. The ground operator has no input table G1 ...Gn, such that each input sub-collection Ii has an as-

(indeed it is the only operator with no input table) and sociated group-by tuple (G1 : g1i ,...,Gn : gni ). For each always returns {[]}, i.e., a table that has a single tuple with input sub-collection Ii, the operator derives an output sub- no binding attribute. collection Oi by replacing the parameter input X with Ii in The scan operator Scans7→a inputs a table. For each bind- the expression eX. The operator’s result is ing tuple t of the input table, the scan finds the collection ∪i=1,...,l{(g1i , . . . , gni ) × Oi}, i.e., the union of the output 4 vc = {e1, . . . , en} identified by s, where either (a) s is a sub-collections after they are padded with their group-by named object id (e.g., customers, selected_nations), in values. For example: which case vc is the value of the named object, or (b) s is a χnation;Topk2(τsales(X)) nation name sales binding attribute of t and vc is its value. In either case, the nation name sales USA Joe 6 scan outputs binding tuples t#[a : e1], . . . , t#[a : en].  USA Joe 6 China Chen 5 = The navigate operator (Navs.p .....p 7→a) outputs exactly China Fu 8 1 m China Fu 8 China Chen 5 one binding tuple t#[a : v] for each input binding tuple t. China Zhao 4 The value v is non-null, when the binding tuple t has an Functions SQL++ allows functions over input attributes in attribute s whose value is a tuple t1 or an object reference the SELECT, FROM, WHERE, ORDER BY, GROUP BY, JOIN, PARTI- pointing to a tuple t1. In either case, the tuple t1 has an TION and LIMIT clauses. A flexible function can be executed attribute p1 whose value is a tuple t2 with an attribute p2. in both memory and database sources, whereas a source- The navigation continues until a tuple tm is reached and tm specific function can only be executed in one source. For has an attribute pm whose value is v. If there is no such path example, = is a flexible function, whereas db.date_part() of tuples t1, . . . , tm with attributes p1, . . . , pm, the value v is (Figure 6, Line 3) is a database-specific function. null. In the case that the value of s is a collection with a In the final distributed plan, each operator is designated single tuple (or single reference to a tuple), it is treated as to execute in a specific source. To handle the case where if it were a tuple or an object reference. a SELECT, WHERE, ORDER BY or LIMIT clause involves both For notational brevity, this paper also uses the derived op- memory-specific and database-specific functions, the SQL++ erator scan-navigate (ScNa) that combines a scan operator operators π ¯ , σc, τ ¯ and Topk are restricted from naming that produces a binding attribute a, with multiple navigate L F k grid functions in their parameters. Rather, the function operator grid operators that initiate their navigation from a. In particu-system system λf(A1,...,An)→N evaluates a single function f. For each input lar, ScNa 1 1 n n ≡ s7→a;a.p 7→a ,...,a.p 7→a grid tuple t with attributes A1,...,An, the λ operator outputs grid Nava.p17→a1 ... Nava.pn7→an Scans7→a. Furthermore, thesystem out- an output tuple that has all the input attributes, and an ad- system put attribute name can be omitted if it is identical togrid the ditional attribute N that is the result of evaluating f using grid input attribute name. system system the A1,...,An values in t. For example, the SQL++ clause Tuple-at-a-time nesting operator Nested queries aregrid trans- ORDER BY g(X),f(Y,Z) is translated into: grid lated into SQL++ algebra using the apply-plan operatorsystem system τN ,N (λf(Y,Z)→N (λg(X)→N )). grid 1 2 2 1 grid αP →N . The α operator inputs a single table. In general, system system P is a correlated plan, such that attributes of the input ta- grid grid p ! aggregates Plan p ble may appear in P wherever constants or named id’ssystem can ! system appear in uncorrelated queries. Such correlated input at- grid grid tributes are called parameters of P . The α operator outputssystem !nation_key, name ! order_year, sum_price system one tuple t#[N : P/t] for each input tuple t, i.e., each out- grid Topk grid system ! selected = true AND 3 system put tuple has all attributes of the respective input tuple t, nation_ref = nation_key and an additional attribute N for the result of evaluatinggrid ScNa grid system session ! sum_price system the plan P/t, i.e. the non-correlated plan that results from .selected_nations ! s; s.nation_ref, substituting each parameter with its respective bindinggrid in s.selected grid system !order_year; system SUM(total_price) sum_price the tuple t. ScNadb.nations ! n; ! grid n.nation_key, grid Nesting via grouping SQL++ allows aggregate functions n.name system !date_part('year', order_date) system that produce non-scalar results. Of particular interest to the Ground ! order_year grid grid optimizations presented in this paper is the system ! nation_ref = @nation_key AND system cust_ref = cust_key NEST(A1,...,At) → N function whose output is a new tablegrid grid system ScNadb.customers ! c; system N. Formally, the function is evaluated on each input sub- c.cust_key, c.nation_ref grid grid collection I1,...,Il created by γG1,...,Gn;NEST(A1,...,At)→N ac- ScNa system db.orders ! o; system cording to the group-by attributes G1 ...Gn. For each input o.cust_ref, o.order_date, o.total_price grid grid sub-collection Ii, the function NEST performs the projection system Ground system πA1,...,An (Ii) and outputs the table attribute N. grid grid Partition The partition operator χ enables support ofsystem the Figure 7: Initial plan from translation into algebra system SQL 2003 PARTITION feature and window functions. In ad- Figure 7 shows the initial algebraic plan translated from 4In case s identifies a non-collection value v and coercion is the SQL++ query of Figure 6. The Scansession.selected_nations enabled, the case is treated as if s had identified the collec- operator scans an in-memory Java collection object and out- tion {v}. puts a tuple with attributes nation_ref and selected. The grid grid system system

α operator represents the SELECT clausegrid nested query: the !" grid !"#$%!"#%&'%()*% CS ! E X = V !…! X = V correlated plan p (within the dottedsystem box) corresponds to )% !"1 1 1 1 n n system Lines 3-11 of the SQL++ query. Noticegrid the use of the pa- !"#$%!"$%&'%()*% CS E grid system m ! m !V1...Vn ; system !"#$%%&%&'%(% ScNaT !t 0 NEST (O1...Oo,A1...Aa,N1...Nq ) rameter @nation_key in the correlated plan. According to T ! E t.X1!V1…t.Xn!Vn grid %'+,+-#%.%/012%+*% 0 !N grid the semantics of α, for each nation inputsystem tuple t, the cor- % system Ground !P !N '+,+-#%)% 1 1 related plan is evaluated with the parametergrid @nation_key !" grid /012%%&%% substituted by the corresponding valuesystem in t. The set of tu- !P !N system ,+/#%13#+0%41"5%(% q q ples resulting from the correlated plangrid evaluation becomes %'+,+-#%% grid system ! V1...Vn,O1...Oo, A1...Aa system the value of the new attribute aggregates. The function %%'#()('*6%% grid %%5789(+ )+ ,- )- ,/ )/ *%&'%/( grid date_part is evaluated in a λ operator. # & # . # 0 !V ...V ;Topk (" ) system %/012%%%(% 1 n n T1...Tt system grid %%'+,+-#%%% grid 4.2 Apply-plan Rewriter ! h system %%%'#()('*6% (( system (((+#()(+&,(( grid !V ...V G ...G ; grid grid (((1#234(-"(-#()(1.234(-"(-,(grid 1 n 1 k !P!N system A A system system system f1 (.)!A1... fa (.)!Aa (((5#(-"(/#()(50(-"(/0( grid grid grid %%/012%% 6%6( grid ! c system & system system E CS ! E %%!$+0+%7( system !" 1 1 grid !"#"$%&'& grid %%:013;%<=%'#)'*89#)9grid:( F grid systemCS ! E system system &&(& m m %%$&>"5:%;( system grid grid %%10?+0%<=%%#()(%<( grid grid &&&&)*%+&$!!&,!&('-.&& !P !N system 1 1 %%,"2"#%@% system ! &&&&'.&& system!" system grid %*%A8%B@@7C% grid &&&&)*%+&"# &,!&('-& grid ScNaT t grid system $ !Pq!Nq %:013;%<=%' ()(' system 0! system # *( t.X !V …t.X !V system &&&&!"#"$%&& *%&'%@7897D% 1 1 n n grid grid grid grid system %%%%%%&!%'%&()%% ! O1...Oo,A1...Aa 15%= >' (A@D%)%A@D%=system>' system # # * *( Ground system %%%%%%*!+,-%.#%.!%'%*/+,-%.#%./)%% grid Topkn grid system %%%%%%0!%.#%1!%'%02%.#%12% grid system grid &&&&/012&3% system Figure 9: Pattern output by apply-plan rewrite rule system grid grid ! T1...Tt system &&&&)+"0"&4% grid system grid &&&&30145&67&5!%'%56% system system grid ! h grid system &&&&+,8*93&7% Suppose attributessystem X1,...,Xn of expression E (i.e. the &&&&10:"0&67&8 %'%8 input of α) are the parameters of the correlated plan P . grid ! 9% ! grid system &&&&#*2*%&:% G1...Gk ; f1 (.)!A1... fa (.)!Aa The rewriter replacessystem the α with the plan of Figure 9, which 5 grid %%-&,!&9& intuitively proceedsgrid in the following steps. ! c system /012&;% First the input systemE of the α is evaluated and assigned to a grid grid system F temporary T . Insystem the running example, this means that na- tion keys and names are computed. See in Figure 10, which grid Figure 8: Pattern for Set-Processable Queries and Plans grid system shows the rewrittensystem plan of Figure 7, that the temporary T1 grid grid system The apply-plan rewriter inputs a plan (eg, see the pattern contains the nationsystem keys and names. Notice that the nation grid of Figure 8) and replaces tuple-at-a-time α operators with key would have togrid be computed even if it were not included system set-at-a-time operators, thereby producing plans as the one in the SELECT clausesystem of the outer query. The reason is that grid shown in Figure 9. the nation key is necessarilygrid an input of α since it is a pa- system system The following case analysis proceeds in the following steps rameter of its nested plan. of increasing complexity: Second, the subplan of the right hand side of the outerjoin of Figure 9 is computed. This subplan has one output tuple 1. We first present the special case where the FROM clause t = [X1 : v1,...,Xn : vn,N : P/(X1 : v1,...,Xn : vn) has no join or outerjoin expressions (called expression- for each distinct tuple of parameters v1, . . . , vn. This tuple free FROM case) and the plans of the α operators also contains a binding attribute N, which carries the result either do not involve or any aggregation or they in- P/(X1 : v1,...,Xn : vn) , i.e., the result of the execution volve both a group-by clause and aggregation (called of the nested plan P when its parameters are instantiated no-total-aggregation case). The analysis will reveal the to v1, . . . , vn. IN the running example, the right hand side potential requirements placed on the object references subplan has attributes nation_key and aggregates. and object id’s of the data. Notice that the subplan has emerged from rewriting the ground of the original nested plan with a duplicate-eliminating 2. Then we extend to FROM clauses that may involve ex- (δ) projection of the parameters in E. Conceptually, this pressions. This case encompasses the expression-free- guarantees that the subplan will produce all combinations FROM case. of parameters with tuples from the FROM clause of the nested plan P . (Again, we stress that one should think of this ex- 3. Finally, we present the total aggregation case where we planaion purely “conceptually”, since subsequent optimiza- remove α operators whose plans involve aggregation tions will introduce various optimization efficiencies.) without grouping. For example, a total aggregation A number of issues, pertaining to aggregation and top-k, case would be a modification of the running example have to be taken care of by the rewriting. In particular, if where the nested query involves no grouping and re- there were any grouping (γ) in the original plan, then its turns the total sum of sales for its nation parameter. grouping attributes have to also include the parameters. If The total aggregation case has a fundamentally differ- there were any ordering (τ) and top-K (Topk) they have ent rewriting from the no-total-aggregation case. 5Notice that the plan does not yet describe the fine details of 4.3 Expression-free FROM with no total ag- its execution, neither is optimized for execution yet. These gregation optimizations and details will be added later. to be replaced by a partition (χ) operator. In the running parameteri v”, whereas the variable v ranges over the mem- example, the top-3 sales dates computation is achieved by bers of a collection that is provided as a parameter from the partitioning the input by the nation_key parameter and outer level. The algebraic counterpart will be a plan of the then keeping the top-3 of each partition. form αP 7→N E, where the nested plan P is Scanc7→v Ground Finally, the natural left outer-join ./ combines by join- and the outer plan E produces a binding attribute c. Note ing on the parameters, the temporary T (which intuitively that we deliberately focus on the scan corresponding to the carries the result of the outer query, along with parameters) nested query’s FROM. It will be obvious how the point below with the right hand side subplan (which carries the result applies even in the presence of selections, groupings, joins, of the inner query, also along with parameters). Simplifica- additional scans etc. tions of the rewrite rule are easy to produce for when one The elimination of α in this example requires that the pa- or more of the assignments, π, Topk, τ, σ or γ operators are rameter c appears in the group-by list. From an implemen- absent. tation point of view, this is efficient (and therefore worthy grid In the case that the plan P of the α contained any as- of reducing tuple-at-a-time to set-at-a-time)grid only if c is a system signments, these assignments would simply be set-processed directly or indirectly identifiable object.system We call it directly grid (by application of the exact same rewritings that we use to identifiable if c is an object. We call it indirectlygrid identifiable system system rewrite the result expression) and would be computed before if it is only the collection value of an attribute of a tuple grid grid system the computation of the outerjoin. that is an object (i.e., identifiable). Insystem the indirect case, the tuple’s id can effectively operate as a proxy for the missing grid grid system T1 Main plan id of c itself. system grid In either case, the source has to eithergrid explicitly support system !nation_key, name the export of id’s to Collage, so that thesystem grouping according grid to c is executed at Collage, or, the sourcegrid needs to be able to system ScNa !nation_key; system ! selected = true AND T1 ! t; efficiently group according to object references/id’s, so that nation_ref = nation_key t.nation_key, NEST(order_year, sum_price) grid t.name ! aggregates grid ScNa Collage can defer to the source. The Java runtime is such a system session Ground !nation_key, system .selected_nations ! s; order_year, sum_price source, in the sense that it enables an implementation where grid s.nation_ref, grid s.selected system !nation_key;Topk3 ("sum_price ) Collage can group by object references.system ScNadb.nations ! n; As an example of the above case consider the following grid n.nation_key, grid system n.name !nation_key, order_year; query system SUM(total_price) ! sum_price grid Ground grid system !date_part('year', order_date) SELECT c.name, (SELECT n.* FROM c.nationssystem n) FROM session.countries c ! order_year grid grid system ! nation_ref = nation_key AND system cust_ref = cust_key This query is set processable only if c.nations is an object grid grid ScNadb.customers ! c; (direct case) or c is an object (indirect case). system c.cust_key, c.nation_ref system grid grid ScNadb.orders ! o; 4.4 The total aggregation case system o.cust_ref, o.order_date, system o.total_price An extra challenge is introduced by nested queries that grid grid system ! feature aggregation but do not featuresystem grouping. For exam-

grid ple, consider the following query that computesgrid the sum of ScNaT1 ! t; system t.nation_key sales per customer. system grid Ground grid system SELECT cust_key, ( system SELECT SUM(o.total_price) AS sum_price Figure 10: Plan after apply-plan rewriting FROM db.orders o We call this type of plan a normalized-sets plan, i.e. for WHERE c.cust_key = o.cust_ref

γG1,...,Gn;NEST, the grouping attributes includes only output FROM db.customers c attributes from E that are in correlated attributes X1,...,Xn. For example in Figure 10, the γNEST has only grouping at- Then consider a customer c that has made no order yet. tribute nation_key which corresponds to the correlated at- By following the provided α-removal rule we derive a plan tribute, whereas the non-correlated name is deferred until af- where the sum of sales is null, as opposed to the correct, ter the ./. Section 4.8 explains why these plans are superior which is zero. In such cases where there are no grouping than alternate denormalized-sets plans where the grouping attributes, any null that appears in lieu of the value of the attributes of γNEST also includes non-correlated attributes, aggregate f should be replaced with the result of this func- i.e. the input of γNEST is denormalized with respect to the tion f when it is given an empty group We omit the algebraic correlated attributes. specifics of performing this change of value. When there are more than one α operator in a plan, the rewriter eliminates them by repeated application of the 4.5 Unrestricted FROM clause rewrite rule in a top-down manner: an α that appears in a Next we consider FROM clauses that contain join and/or nested plan is eliminated after its parent α has been elimi- outerjoin expressions. While joins can be rewritten away, nated. outerjoins cannot be rewritten away and therefore we will Requirements of set processability on object id’s and focus on them. We will first consider the case where the references Under certain circumstances described next, the FROM clause of the nested query is solely a single outerjoin.

reduction of tuple-at-a-time processing to set-at-a-time pro- In particular, consider the expression αl./cr7→N E, where the cessing (set processability) is dependent on the availability left hand side l of the outerjoin involves parametersx ¯l and of object references and object-id’s at the source data. In the right hand side r of the outerjoin involves parameters particular, consider a nested FROM clause “FROM hcollection x¯r. If the right hand side r is uncorrelated, i.e.,x ¯r = ∅, ing whether it is executed in-memory or in-database. then the α removal proceeds similarly to the expression- The distributor is tuned for characteristics common to

free FROM case. Namely, the temporary T = δπbarxl,x¯r E most web applications: persistent data in the database data is computed first. Then the ground of l is replaced with are orders of magnitude larger than both (1) the transient T (the resulting expression denoted as l(T )) and the usual memory data input by the SQL++ query, and (2) the output combination of γNEST and outerjoin accomplishes the required of the SQL++ query as displayed on the page. Therefore, nesting. In summary, the rewritten plan is the source decision of an operator is determined through bottom-up sub-plan maximization as follows: (i) If an oper- ator is source-specific, it is decided to be executed in that T 7→ δπbarx ,x¯ E; l r source. (ii) If an n-ary operator (i.e. join, ∪ and ∩, which are E ./γ ¯ (l(T ) ./c r) x¯l;NEST(O) all flexible) has children distributed across both sources, the where O is the list of binding attributes of l and r. operator is decided to be in-database, and the in-memory If the right hand side r is also correlated (i.e.,x ¯r 6= ∅) then inputs are copied to the database. This is because copying the ground of r is also replaced with T . Note however that in in-memory inputs to the database is much faster than the order to avoid cardinality errors in the result, a tuple of l(T ) opposite. (iii) Otherwise, the operator is flexible and has should match a tuple of r(T ) only if they are derived from children within a single source, therefore the operator is de- the same tuple of T . Technically this intuition is captured by cided to be in the same source as its children, in order to changing accordingly the condition of the ./. In particular, recursively maximize the size of sub-plans. 0 ¯0 let us denote by r the plan r/(x ¯r 7→ xr), i.e., the plan Figure 11 shows the plan after the distributor has deter- that results from the substitution of the parametersx ¯r with mined source decisions, as denoted with db or mem. Cuts (i.e. 0 parameters with fresh namesx ¯r . Then the needed rewriting data transfers) between a pair of adjacent in-memory and in- is database operators are denoted with dotted perpendicular lines.

T 7→ δπbarxl,x¯r E; 0 4.7 Plan-to-SQL Translation E ./γ ¯ (l(T ) ./ ¯0 r (T )) grid x¯l;NEST(O) x¯r =xr ∧c grid system system It is straightforward to see how the above generalizes to T db Pmem grid grid i ! i system the case where the plan of α were not simply an outerjoin system db mem T j ! Pj grid but also included grouping, top-k etc. grid system system Pmem Pmem 4.6 Distributor k k grid grid mem system systemSe ndQuery Pdb grid db T1 grid system Main plan system  grid db mem grid system !nation_key, name system P!db db mem db mem P nation_ref = j grid mem!nation_key; Pi grid system nation_key ScNaT1 ! t; NEST(order_year, sum_price) system db t.nation_key, ! aggregates mem grid ScNadb.nations db t.name grid db system ! n; system ! selected n.nation_key, Ground nation_key, Figure 12: Translating Maximal Sub-plans to SQL = true !   n.name order_year, sum_price grid mem db db Figure 12 illustrates how the plan-to-SQLgrid translator trans- system Ground !nation_key;Topk (" ) system ScNasession 3 sum_price grid .selected_nations ! s;  lates maximal sub-plans to SQL. Givengrid source decisions on s.nation_ref, db system mem s.selected !nation_key, order_year; each operator, the plan-to-SQL translatorsystem performs the fol- SUM(total_price) ! sum_price db grid Ground db lowing recursively: Each maximal in-databasegrid sub-plan P mem system !date_part('year', order_date) is replaced by a SendQuery operator,system where S is the SQL ! order_year S db db grid nation_ref = statement translated from P . As agrid special case, when system nation_key mem system db SendQueryS is the root operator of e in an T ← e, a tem- grid db grid cust_key = db porary table T is created directly from the results of S with- system cust_ref system ! out any data transfer. Each maximal in-memory sub-plan  grid db db db mem db grid db mem system ScNa Q that is an input to P is movedsystem to a new U ← Q , ScNadb.customers ScNadb.orders ! o; T1 ! t; t.nation_key grid ! c; o.cust_ref, which creates a temporary table U in thegrid database and bulk c.cust_key, o.order_date, system c.nation_ref o.total_price copies the result of evaluating Q intosystemU. U is subsequently db db db grid utilized by S. The final plan of the runninggrid example as system Ground Ground Ground output by the plan-to-SQL translator issystem shown in Figure 13. Figure 11: Plan after distributor’s source decisions 4.8 Alternate Denormalized-sets Plan The distributor decides the source where each operator To illustrate an alternate rewrite rule for replacing α op- will be executed, with the objective of outputting efficient erators, consider the denormalized-sets plan in Figure 14, physical plans that limit the number of tuples transferred which is Figure 7 rewritten with the alternate rule, and aug- between sources. Operators are either source-specific or mented with source decisions for ease of exposition. The flexible. A source-specific operator is: (1) a ScNa (2) an α operator is replaced, but with key differences from the α (3) a λf where f is a source-specific function, or (4) normalized-sets plan of Figure 10. As quantified by ex-

a γG¯;f1→N1,...,fm→Nm where one or more of f1 . . . fm are periments in Section 5.2, this denormalized-sets plan has source-specific. All other operators are flexible. For each two inherent performance limitations as compared to the operator, the distributor provides a source decision indicat- normalized-sets plan. grid grid system system

grid grid system system

grid grid system system db T2 Main plan additional tuples will similarly increase subsequent process- grid grid system mem mem ing and data transfer costs. In contrast,system the normalized-sets ! selected = true  grid mem mem plan avoids these redundancies by addinggridScan and δ opera- system mem system ScNasession SendQuery !nation_key; tors (which is reminiscent of the classical semi-join technique .selected_nations ! s; NEST(order_year, sum_price) grid s.nation_ref, !"#"$%&'()*+',-./0&& ! aggregates in distributed query processing, and deferringgrid ./ till after all system s.selected &&&&&&&'(5.& system  other operators. mem 89:;&&&%Q& mem grid grid Ground SendQuery Second, recall that ./ is not a commutative operator, system and thus restricts join reordering optimizationssystem 6. In the grid !"#"$%&&'()*+',-./0&+12.1,/.(10&345,61*7.& grid db denormalized-sets plans, ./ is decided to be in-database, system T1 89:;&&&&<& system &!"#"$%&,=>?0& thus the resulting SQL query specifies (T3 ./ nations) ./ grid !"#"$%&'()*+',-./0&&& &&&&&&&&1+@,'45A.1&:B"9&<& grid system &&&&&&&'(5.&& &&&&&&&&&&CD9%E%E:F&GH&,=>'()*+',-./& (customers ./ orders), where T3 (notsystem shown in figure) is a 89:;&&&'()*+'3&D!&'& &&&&&&&&&&:9I"9&GH&,=>345,61*7.& grid temporary table bulk loaded with session.selected_nationsgrid N:EF&&&%=>'()*+',1.P&O& &&&&&&&&J&D!&1+@,*2& system system &&&&&&&'>'()*+',-./& &89:;&&&<& tuples. Due to ./, no join reordering occurs, thus the database grid &&!"#"$%&&&'()*+',-./0& first executes the expensive join betweengrid customers and system &&&&&&&&&&&2().,6(1)+12.1,2().J& system &&&&&&&&&&&D!&+12.1,/.(10& orders, which produces as many tuples as orders. In con- grid &&&&&&&&&&&!L;<+>)+)(M,61*7.J&D!&345,61*7.& trast, the normalized-sets plans places ./gridabove the memory- system &&89:;&&&&&743)+5.13&D!&7& system &&N:EF&&&&&+12.13&D!&+&:F&7>743),-./&O&+>743),1.P& specific γNEST, causing the distributor to decide ./ to be in- grid &&N:EF&&&&&<&!"#"$%&IE!%EF$%& memory. The resulting SQL query only hasgrid ./ operators, i.e. systemgrid &&&&&&&&&&&&&&&&&&&&'()*+',-./& systemgrid system a three-way join T1 ./ customers ./ orderssystem , where T1 has grid &&&&&&&&&&&&&89:;&&&%Q&J&D!&,Q& grid &&&&&&&&&&&:F&7>'()*+',1.P&O&,Q>'()*+',-./& systemgrid nation_keys of selected nations (Figuresystemgrid 13). The database &&R9:LC&GH&'()*+',-./0& system system grid &&&&&&&&&&&2().,6(1)+12.1,2().J& reorders joins and first executes T1 ./ customersgrid , which is systemgrid &J&D!&,=& an order of magnitude more selectivesystemgrid than customers ./ system J&D!&,S& system grid TU"9"&&&1+@,*2&VO&S& orders, thus producing a substantially moregrid efficient plan. systemgrid systemgrid system system Figure 13: Plan after translating in-database sub-plans to grid 5. EXPERIMENTAL EVALUATIONgrid SQL system We demonstrate the benefits of holisticsystem optimization by grid grid mem experimentally evaluating Collage’s performance speedup on system !nation_key, name; system NEST(order_year, sum_price) ! aggregates the running example of Figure 1. All experiments are run grid grid system db with PostgreSQL 9.2 on Mac OS X 10.6,system using 512 MB of !nation_key, name, order_year, sum_price grid RAM for PostgreSQL’s memory buffersgrid to cache random  system db system !nation_key, name;Topk3 ("sum_price ) disk accesses, and 1 GB of RAM for OS X’s disk cache grid db to cache sequential disk accesses. Withgrid a scale factor of system !nation_key, name, order_year; system SUM(total_price) ! sum_price 3 for the TPC-H benchmark [22], the data generator pro- grid duces a database that contains 25 nations,grid 450,000 cus- system !date_part('year', order_date) system ! order_year tomers and 4,500,000 orders. Running times are based on grid db grid nation_key = warm buffers/caches. system nation_ref system Four queries/plans are evaluated: (1) RoR denotes the grid  grid db  cust_key = cust_ref system !nation_key, name db tuple-at-a-time queries produced by thesystem RoR code of Fig- db db db ure 3 (2) T denotes the initial tuple-at-a-time plan in Fig- grid nation_ref = ScNa grid system nation_key ScNadb.customers db.orders ! o; ure 7, (3) NS denotes the normalized-setssystem plan in Figure 10 ! c; o.cust_ref, grid mem db c.cust_key, o.order_date, as output by the apply-plan rewrite rulegrid (4) DS denotes the ScNa c.nation_ref o.total_price system ! selected db.nations db db system = true ! n; denormalized-sets plan in Figure 14 as output by the alter- grid mem  n.nation_key, Ground Ground grid ScNasession db n.name nate rewrite rule. system .selected_nations Ground Section 5.1 shows the speedup of set-at-a-timesystem over tuple- mem ! s; grid s.nation_ref, grid system Ground s.selected at-a-time semantics by comparing thesystem running time of the rewritten NS against the initial T and mainstream RoR. Sec- Figure 14: Denormalized-sets plan by alternate rewrite rule tion 5.2 validates our choice of rewrite rules by demonstrat- ing the superiority of NS over the alternate DS.

First, observe that for αP →N (E) where E has correlated 5.1 Tuple-at-a-time vs Set-at-a-time plans attributes X1,...,Xn, there is a functional dependency from In Figure 15, the number of selected nations k is varied X1,...,Xn to the attributes of N. For example, from 5 to 25 to show the speedup of set-at-a-time versus nation_key → order_year,sum_price. In the denormalized- tuple-at-a-time semantics at different sizes of results dis- sets plan, ./ denormalizes intermediate results by joining played on pages. Due to their tuple-at-a-time semantics, on X ,...,X , thus introducing more non-correlated at- 1 n both RoR and T issue k queries, i.e. one query per selected tributes and/or tuples from its LHS. This creates two types nation. The join between customers and orders, which is of performance penalties in Figure 14. (1) Horizontal redun- an expensive operation in each query since the low join se- dancy: ./ outputs wider tuples due to the additional name lectivity causes Postgresql to scan both tables sequentially, attribute, which increases processing costs of the γSUM, χ, π is evaluated k times in total. Not only do the running times and γNEST operators above, as well as the data transfer cost across the cut. (2) Vertical redundancy: ./ outputs more 6While database researchers have shown that rewriting ./ tuples than its RHS. If the correlation attributes were other and other join operators to a “generalized-join” operator can than nation_key and have duplicate values in the LHS, each enable further join reordering [12], we did not find a RDBMS RHS tuple will be copied once per duplicate value, and the that implements this optimization. ! "# "! $# $! $%&'() !"# $"! %&"' %(") %*"( *++,-%,'. &&"! )*"( #!"! ''"* %+#"' /0/ &)"( )!"# !)"& $'"% %%#"( 1+))23+ 45167(6/0/8 ("& )"$ *"' !"( !"#

%)+, due to the added partition χ operator. However, at the low %&+, selectivities of fanouts 4,500 and 22,500, RoR and T incurs %++, sequential scans of customers and index scans of orders, '+, whereas NS incurs sequential scans of both customers and 9:;)64()<0.2(86 #+, orders. This is a classical example of when low selectivity )+, leads to sequential scans being more efficient than multiple &+, index lookups, thus NS shows a speedup of 4.5-6.3x. At the +, *, %+, %*, &+, &*, lowest selectivity of fanout 112,500, all plans incur sequential -.-, &)"(, )!"#, !)"&, $'"%, %%#"(, scans for both tables, thus the reasons for speedup is iden- /, &&"!, )*"(, #!"!, ''"*, %+#"', tical to that of Figure 15. These data points illustrate that 01, !"#, $"!, %&"', %("), %*"(, when databases increase in size, the consequent increase in 1233452, ("&, )"$, *"', !"(, !"#, fanouts (and hence lowering of selectivities) in aggregation 601,78,-.-9, =>653;?)@60A6(),) !"* $"( )"* &"# Table 1 compares the running times between DS and NS. Two parameters are varied: (a) the width of a tuple in the $!+ nation table along a logarithmic scale, by changing the char- *!+ acter length of the name column (b) the fanout of customers

)!+ per nation, by using the identical methodology and fanout values as in Section 5.1. The number of selected nations (!+ is also fixed at 10. Both parameters respectively highlight &!+ the two performance limitations of DS as explained in Sec- ?$@3<;23A%&:2>< %!+ tion 4.8. +++++++++++++++,,-./0,*!"12-3415.6+78+12-3415.69+ !+ From top-to-bottom for increasing tuple widths, the run- P!!+ )*!!+ &&9*!!+ %%&9*!!+ +++++++++++++++,,-./0,*%"4,4:;.:;2-.+78+4,4:;.:;2-.9+ +++++++++++++++8<=>7??+,,-./0,*%"4,-4-2@0:3A.B+78+CD/,A2@@,,E&#ning time of increases proportionally to the tuple width, G4G+ !")+ (#+ *!+ ()+ DS ++++++++++++FGH=whereas NS is unaffected. This illustrates the horizontal re- K+ !"'+ ((+ )(+ &)+ ++++++++++++++++++> M8+ !"#+ $+ %%+ %&+ +++++++++++++++++++++8I?IJK+7??dundancy experienced in DS, and the effectiveness of the 80..;D0+ ++++++++++++++++++++++++,,02:2/,("1,12-3415.6+78+12-3415.6Scan and δ operators in NS for avoiding these redundan- !"*+ $"(+ )"*+ &"#+ >M8+EC+G4GB+ +++++++++++++++++++++FGH=cies. Note that vertical redundancy is not demonstrated ,"&%-#B From left-to-right for increasing fanouts (i.e. lower nation- ++++++++++++++++++++++++8I?IJK+7?? Figure 16 shows how different join selectivities in the database+++++++++++++++++++++++++++4"4,ADC-5.6+78+4,ADC-5.69+customer selectivities), the running times for both DS and affects speedup. To vary join selectivities, different TPC-H+++++++++++++++++++++++++++4"4,-4-2@0:3A.+78+4,-4-2@0:3A.9+NS increase. Recall that DS executes (T3 ./ nations) ./ +++++++++++++++++++++++++++4"4,4:;.:;2-.+78+4,4:;.:;2-.(customers ./ orders) without the benefit of any join re- database instances are created by changing the total num-++++++++++++++++++++++++FGH= ber of nations from 25 to 500, 100, 20 and 4 respectively.+++++++++++++++++++++++++++0DO@3A"4:;.:C+78+4ordering. With a lower selectivity, the ./ on the large in- The total number of customers and orders remain the same,+++++++++++++++++++++B+78+,,-./0,*%termediate result customers ./ orders becomes more ex- +++++++++++++++++++++LMMIG+NHLM+pensive, thus DS increases in running time. On the other but customers are evenly re-distributed among the avail-+++++++++++++++++++++> able nations. This produces a logarithmic scale of nation-++++++++++++++++++++++++8I?IJK+7??hand, NS executes (T1 ./ customers) ./ orders after join to-customer fanouts: 900, 4,500, 22,500 and 112,500. The+++++++++++++++++++++++++++A"A,ADC-5.6+78+A,ADC-5.69+reordering, thus it is affected to a lesser extent by the lower +++++++++++++++++++++++++++A"A,12-3415.6+78+A,12-3415.6selectivity since customers is an order of magnitude smaller number of selected nations is fixed at 10, except in the++++++++++++++++++++++++FGH= last fanout where only 4 nations can be displayed. Higher+++++++++++++++++++++++++++0DO@3A"ADC-4/.:C,*!!+78+Athan customers ./ orders. nation-to-customer fanout corresponds to lower selectivity These data points illustrate that when tables become wider for nation ./ customers. and fanouts increase, NS will have increased benefits over At the high selectivity of fanout 900, PostgreSQL uses DS, thus providing empirical justification for Collage’s apply- index lookups against both customers and orders for all plan rewriting to output NS plans. of RoR, T and NS. In this case, NS does not increase perfor- mance over RoR, and in fact incurs a 0.1s penalty over T 6. RELATED WORK database server. Database and programming language research work has investigated the problem of efficient and/or single-language 7. REFERENCES access, both before and after the advent of ORMs. Single access via semistructured query languages Strudel [11][1] M. O. Akinde and M. H. B¨ohlen. Efficient showed that content-publishing web pages can be conve- computation of subqueries in complex olap. In ICDE, niently specified using a semistructured XML query lan- pages 163–174, 2003. guage. Later, XPERANTO [21] and SilkRoute [10] ad- [2] Anonymized due to double-blind requirements. dressed the optimization of XML queries that produce nested [3] Anonymized due to double-blind requirements. results by accessing an SQL database. The semistructured [4] Anonymized due to double-blind requirements. query is executed by emitting one or more SQL queries to [5] A. Badia. Computing SQL queries with boolean the database and consequently combining the results. aggregates. In Data Warehousing and Knowledge Unlike Collage, they do not consider analytics queries (i.e., Discovery, pages 391–400. Springer, 2003. queries with group-by, top-k), memory-database distribu- [6] B. Cao and A. Badia. A nested relational approach to tion or navigations into external objects (and the compli- processing SQL subqueries. In Proceedings of the 2005 cations they introduce on the reduction of tuple-at-atime ACM SIGMOD international conference on to set-at-a-time). Due to the limited (with respect to Col- Management of data, pages 191–202. ACM, 2005. lage) of their queries and the respective limited scope [7] A. Cheung, O. Arden, S. M. A. Solar-Lezama, and of their optimization decisions, they do not need Collage’s A. C. Myers. Statusquo: Making familiar abstractions algebra rewriting-based query processing engine. perform using program analysis. CIDR, 2013. In the context of non-distributed select-join nested queries [8] W. R. Cook and A. H. Ibrahim. Integrating (i.e., nested queries over just an SQL database, group-by and programming languages and databases: What is the top-k) XPERANTO, Silkroute and Collage still have differ- problem. In In ODBMS.ORG, Expert Article, 2005. ences. XPERANTO produces an SQL query q for each col- i [9] U. Dayal. Of nests and trees: A unified approach to lection ci of the nested query (be it the top level connection or a nested collection) by left outer-joining all the FROM- processing queries that contain nested subqueries, aggregates, and quantifiers. In VLDB, pages 197–208, WHERE clauses on the path to ci. Unlike Collage, each sub-query inefficiently uses the entire table produced for the 1987. parent collection although only the correlating attributes are [10] M. Fernandez, A. Morishima, and D. Suciu. Efficient required. evaluation of XML middle-ware queries. In ACM Tuple-at-a-time to set-at-a-time reduction in SQL SIGMOD Record, volume 30, pages 103–114. ACM, WHERE 2001. Many research efforts have been devoted to the sub-query [11] M. F. Fern´andez, D. Florescu, A. Y. Levy, and decorrelation in the WHERE clause since the 1980’s. Kim [19] D. Suciu. Declarative specification of web sites with first developed the query transformation algorithms to rewrite Strudel. VLDB J., 9(1):38–55, 2000. nested queries into equivalent, flat queries which can be [12] C. Galindo-Legaria and A. Rosenthal. Outerjoin processed more efficiently. Dayal [9], Muralikrishna [20], simplification and reordering for query optimization. Ganski-Wong [14] extended the work to support subqueries ACM Transactions on Database Systems (TODS), with aggregates and more than one level of nesting. 22(1):43–74, 1997. Galindo-Legaria and Joshi [13] represents the subqueries [13] C. A. Galindo-Legaria and M. Joshi. Orthogonal in SQL Server using apply operator (as Collage does), and optimization of subqueries and aggregation. In then removing the correlations by transforming them into SIGMOD Conference, pages 571–581, 2001. joins,semi-joins, outer-joins, etc., according to a set of rules [14] R. A. Ganski and H. K. T. Wong. Optimization of built upon. Other works [1, 6, 5] consider the prob- nested SQL queries revisited. In SIGMOD Conference, lem when correlated subqueries occur within positive and pages 23–33, 1987. negative existential (ANY, EXISTS, IN) or universal (ALL) [15] H. Garcia-Molina, J. D. Ullman, and J. Widom. quantification. The techniques all involve extending the Database Systems: The Complete Book, chapter 5, standard relational algebra. We consider them as comple- pages 205–241. Prentice-Hall, 2009. mentary to work ours. [16] T. Grust. Purely relational flwors. In XIME-P, Optimization of database-accessing application code volume 37, 2005. A second line of research adopts a conventional architecture, [17] T. Grust and M. Mayr. A deep embedding of queries where code written in a conventional application program- into Ruby. In ICDE Conference, 2012. ming language (e.g., Java) issues SQL statements directly [18] T. Grust, M. Mayr, J. Rittinger, and T. Schreiber. or indirectly through the use of an ORM. These works ana- Ferry: database-supported program execution. In lyze the code and, consequently, they generally rewrite the SIGMOD Conference, pages 1063–1066, 2009. application code to improve data access performance. [19] W. Kim. On optimizing an SQL-like nested query. Loop lifting [16] is a technique originally developed to ACM Trans. Database Syst., 7(3):443–469, 1982. handle XQuery FLOWR construct using relational algebra plans. Later, it was generalized and extended to the FERRY [20] M. Muralikrishna. Optimization and dataflow compilation framework [18], with a richer data model and algorithms for nested tree queries. In Proc. Int. Conf. more language constructs to handle queries in more general on Very Large Data Bases (VLDB), volume 516, 1989. programming languages. [21] J. Shanmugasundaram, E. Shekita, R. Barr, M. Carey, Finally, StatusQuo [7] automatically partitions the ap- B. Lindsay, H. Pirahesh, and B. Reinwald. Efficiently plication computation between the application server and publishing relational data as XML documents. The VLDB Journal, 10:133–154, 2001. [22] Tpc-h homepage. http://www.tpc.org/tpch/. [23] Wikipedia. Template processor, 2013. Accessed Aug 16 2013. http://en.wikipedia.org/w/index.php? title=Template_processor&oldid=567590946. [24] F. Yang, J. Shanmugasundaram, M. Riedewald, and J. Gehrke. Hilda: A high-level language for data-drivenweb applications. In ICDE, page 32, 2006.