Holistic Data Access Optimization for Analytics Reports∗
ABSTRACT stream ORM. It also shows that normalized-sets plans out- Object-Relational Mappers (ORMs) enable single language perform denormalized-sets plans by manyfold for the case of access to both the main memory data and the database data analytic reports, which was not the case in semistructured of an application. Unfortunately, they also lead to perfor- queries without aggregation. mance inefficiencies, especially in analytics applications with The Collage query processor has been integrated and de- information-rich reports involving nested and aggregated re- ployed as part of a complete web application development sults over large data volumes. framework, including a templating enegine. Using an online Past database research suggests that a report over a database IDE, one can build a wide class of web applications with can be modeled by a single semi-structured query, i.e. a both the ease of single language access, and performance of query involving nesting and heterogeneity. Thus, the re- holistic optimization. port’s data accesses can be holistically optimized through a single query. Collage resolves important practical challenges 1. INTRODUCTION towards enabling the report-as-query approach and its holis- It has been well-known since the 80s that the impedance tic optimization. In particular, the Collage middleware sys- mismatch between the SQL database and the application tem models a report page as a SQL++ query. SQL++ is layer is a major time waste in application development [8]. A a superset of SQL by two extensions: (a) It provides dis- developer has to use two languages (SQL and an application tributed access to an SQL database and the in-memory ob- programming language, such as Java, Ruby or Python) for jects of the web application programming language. (b) The data access, and tediously convert between relational tables query output has the full generality needed by report tem- and objects. plating engines built around HTML and JSON. Indeed, the Best practices among practitioners to mitigate impedance SQL++ data model is a superset of JSON. The Collage mismatch are to utilize Object-Relational Mappers (ORMs). implementation features its own report templating engine With ORMs, developers specify mappings from relational but one can easily utilize Collage’s query processor in other tables into classes of the application programming language. HTML and JSON-based templating engines. Consequently, the ORM automatically translates navigation On the query optimization side, Collage expands into an- across objects of these classes into SQL queries, often involv- alytics queries. The following query optimization problems ing joins. With this object navigation paradigm, ORMs offer emerge and are solved. (1) Novel rewritings guarantee the single language access to all data, be them native objects of compilation of any SQL++ query into an efficient set-at- the application or tuples of the database. The pervasive a-time query plan, assuming the objects of the application adoption of ORMs1, such as Hibernate, Active Record of programming language offer object id’s that can be com- Ruby-on-Rails (RoR) and CakePHP is a testimony to the pared and be efficiently grouped. This guarantee includes importance of the impedance mismatch problem, and the analytics queries, i.e., queries that involve aggregation and desire for single language access to database tuples and ap- top-k processing, which were unaddressed by prior works on plication objects. semi-structured queries. (2) Normalized and denormal- While ORMs provide single language unified access for ized set-at-a-time query plans are formulated and evaluated. simple data access patterns, this ease of use often comes The evaluation validates the manyfold speedup over a main- at a performance cost of queries in very common reporting ∗ applications, especially analytics applications that display Supported by NSF III-1018961 and NSF III-1219263, PI’d nested, aggregated data [18, 17]. As a running example, con- by Prof Papakonstantinou who is a shareholder of App2You Inc, which commercializes outcomes of this research. sider the web application in Figure 1 that has a dashboard page for visualizing and analyzing sales data of a TPC-H [22] database. A user uses the checkboxes to select nations to analyze, and the application stores the selections as in- memory objects within the HTTP session of the application Permission to make digital or hard copies of all or part of this work for server. The application then displays a HTML table of se- personal or classroom use is granted without fee provided that copies are lected nations, where each row comprises the nation name not made or distributed for profit or commercial advantage and that copies 1 bear this notice and the full citation on the first page. To copy otherwise, to In this paper, code examples are presented in RoR, since republish, to post on servers or to redistribute to lists, requires prior specific RoR is a widely used web framework, and Active Record, its permission and/or a fee. built-in ORM, supports many mainstream database engines. Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00. Nevertheless, the discussion applies to other popular ORMs. and the top 3 years of sales revenue. almost always incurs a manyfold performance penalty due to extraneous sequential scans or repeated random accesses, -.$/%-#) as will be demonstrated in our experiments. Worse still, the -.$/%-*+',) performance penalty ratio increases with the amount of data -.&') !%&&'-$) accessed and visualized, since the number of queries issued is dependent on the number of nations. !"#$%&'(#) A sophisticated developer who optimizes for performance will re-implement tuple-at-a-time queries as set-at-a-time !"#$*+',) -.$/%-*('0) queries. Two alternatives demonstrated in this paper are: .11('##) • Denormalized-sets queries: Queries that retrieve re- %(1'(#) sults as denormalized-sets. For example, a single query %(1'(*+',) that retrieves the join of nations and top sales. !"#$*('0) %(1'(*1.$') • Normalized-sets queries: Queries that retrieve results $%$.2*3(/!') as normalized-sets. For example, a query that retrieves Figure 1: Analytics application for TPC- Figure 2: nations, and a separate query that retrieves top sales with corresponding nation ids. H data TPC-H schema An ORM induces developers to decompose data access ! "#$%&'#()*)+,)-#./,0%.#1)'(#23%4#4)4%3'#,%#-/,/5/()#,)4&%3/3'#,/5*)(# into multiple object navigations (translated by the ORM 6 7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?# into multiple queries) that are interspersed among loops and @ ##A$9B7CB#CBDEF979G#C7;HB#()*)+,)-I./,0%.(#?./,0%.I3)2#JKCBLB9MA#M# N ()*)+,)-I./,0%.(<)/+O#-%#P.P# conditionals of imperative languages. This decomposition Q ##7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?# R ####SJKTB9C#JKCF#()*)+,)-I./,0%.(?./,0%.I3)2M#U7HVBT#?A"W.XAMS#M# reduces data access performance: Even though each individ- Y ).-# ual query may be optimally executed by the database, the Z "#T,%3)#+%44%.#(>5[)=&3)((0%.#2%3#()*)+,)-#./,0%.(#0.#,)4&%3/3'#,/5*)(# set of queries collectively may not compute the report effi- \ 7+,08)9)+%3-::;/()<+%..)+,0%.<)=)+>,)?]]T^H# ciently. Such inefficiencies can be very large in analytics ap- !_ ##$9B7CB#CBDEF979G#C7;HB#,)4&I./,0%.(#7T# !! ##TBHB$C#.<./,0%.I1)'`#.<./4)# plications producing reports that involve nesting, joins and !6 ##a9FD###./,0%.(#7T#.# aggregation. In particular, this paper focuses on the ineffi- !@ ##bFJK###()*)+,)-I./,0%.(#7T#(#FK#(<./,0%.I3)2#c#.<./,0%.I1)'# !N T^H# ciencies of tuple-at-a-time queries, where each outer context !Q M# tuple causes the execution of an inner query that produces !R "#T>5[d>)3'#,%#3),30)8)#-0(,0.+,#./,0%.#1)'(# nested tuples. !Y d!#c#ATBHB$C#eJTCJK$C#./,0%.I1)'#7T#./,0%.I1)'I6#a9FD#,)4&I./,0%.(A#
!Z "#T>5[d>)3'#,%#3),30)8)#/ff3)f/,)(#).[4/(()# ! "#$$%&'()*+,'%-./%01'023.4'5%$$6% !\ d6#c#73)*::C/5*)<.)g?:+>(,%4)3(M# 7 ",+3*'6% 6_ ##<&3%h)+,?A,!<./,0%.I1)'I6`## 8 %%"9%!"#$%&'"((')"*+',%-.&.-96% 6! ############-/,)I&/3,?iA')/3iA`#%<%3-)3I-/,)M#/(#%3-)3I')/3`## : %%%%"9%$/-0)00$%&120)()*#),3&"#$%&0241&'&"#$%&35)64-77-#89)-#+)&-96% 66 ############TVD?%<,%,/*I&30+)M#7T#(>4I&30+)AM# ; %%%%%%",/6% 6@ ##<23%4?A+>(,%4)3(#/(#+AM# < %%%%%%%%",=6"9>%&'&":)%96"?,=6% 6N ##
1.1 Contributions This paper focuses on the distributed query processor of Collage, which builds upon existing techniques to introduce 2The machine is hosted anonymously in the AWS cloud novel query processing optimizations: service in compliance with SIGMOD’s double-blind require- Optimizing tuple-at-a-time queries We show that all ments. 2. ARCHITECTURE, SYNTAX AND SEMAN- ! "#$%&%'()%*+%,-./%01$*2(343,%)23$15+ 6 ++!"#"$%&&'(')*+,'-./01&'(')2/1&3& TICS 7 ++++++++++++!"#"$%&&&45(4)*/-6)7*380/)781&,(,74/7-4)*/9&:!&,74/7-0/)71& 8 +++++++++++++++++++++!;<3,(*,*)=-67+>/9&:!&?@2-67+>/& Collage provides an efficient template engine [23] for web 9 ++++++++++++ABC<&&&&&45(,74/7?&:!&,1& : ++++++++++++&&&&&&&&&45(>@?*,2/7?&:!&>& frameworks that follow the Model-View-Controller (MVC) ; ++++++++++++DE"B"&&&&,(>@?*-7/F&G&>(>@?*-./0&:HI& architecture pattern, which is practically all web frameworks. < ++++++++++++&&&&&&&&&>(')*+,'-7/F&G&')*+,'-./0++++++ = ++++++++++++JBC;K&LM&45(4)*/-6)7*380/)781&,(,74/7-4)*/9& In MVC web frameworks, an application is modularized into !> ++++++++++++CBI"B&LM&?@2-67+>/& actions (Controllers) and pages (Views). A HTTP request !! ++++++++++++#N Translating the FROM clause with Ground, Scan and mally, χG1...Gn;eX operates on an input table that has at- Navigate operators The role of the ground (Ground) and tributes G1 ...Gn. It groups the input tuples into input sub- scan (Scan) operators is to provide the algebraic counterpart collections I1,...,Il according to the group-by attributes of the FROM clause. The ground operator has no input table G1 ...Gn, such that each input sub-collection Ii has an as- (indeed it is the only operator with no input table) and sociated group-by tuple (G1 : g1i ,...,Gn : gni ). For each always returns {[]}, i.e., a table that has a single tuple with input sub-collection Ii, the operator derives an output sub- no binding attribute. collection Oi by replacing the parameter input X with Ii in The scan operator Scans7→a inputs a table. For each bind- the expression eX. The operator’s result is ing tuple t of the input table, the scan finds the collection ∪i=1,...,l{(g1i , . . . , gni ) × Oi}, i.e., the union of the output 4 vc = {e1, . . . , en} identified by s, where either (a) s is a sub-collections after they are padded with their group-by named object id (e.g., customers, selected_nations), in values. For example: which case vc is the value of the named object, or (b) s is a χnation;Topk2(τsales(X)) nation name sales binding attribute of t and vc is its value. In either case, the nation name sales USA Joe 6 scan outputs binding tuples t#[a : e1], . . . , t#[a : en]. USA Joe 6 China Chen 5 = The navigate operator (Navs.p .....p 7→a) outputs exactly China Fu 8 1 m China Fu 8 China Chen 5 one binding tuple t#[a : v] for each input binding tuple t. China Zhao 4 The value v is non-null, when the binding tuple t has an Functions SQL++ allows functions over input attributes in attribute s whose value is a tuple t1 or an object reference the SELECT, FROM, WHERE, ORDER BY, GROUP BY, JOIN, PARTI- pointing to a tuple t1. In either case, the tuple t1 has an TION and LIMIT clauses. A flexible function can be executed attribute p1 whose value is a tuple t2 with an attribute p2. in both memory and database sources, whereas a source- The navigation continues until a tuple tm is reached and tm specific function can only be executed in one source. For has an attribute pm whose value is v. If there is no such path example, = is a flexible function, whereas db.date_part() of tuples t1, . . . , tm with attributes p1, . . . , pm, the value v is (Figure 6, Line 3) is a database-specific function. null. In the case that the value of s is a collection with a In the final distributed plan, each operator is designated single tuple (or single reference to a tuple), it is treated as to execute in a specific source. To handle the case where if it were a tuple or an object reference. a SELECT, WHERE, ORDER BY or LIMIT clause involves both For notational brevity, this paper also uses the derived op- memory-specific and database-specific functions, the SQL++ erator scan-navigate (ScNa) that combines a scan operator operators π ¯ , σc, τ ¯ and Topk are restricted from naming that produces a binding attribute a, with multiple navigate L F k grid functions in their parameters. Rather, the function operator grid operators that initiate their navigation from a. In particu-system system λf(A1,...,An)→N evaluates a single function f. For each input lar, ScNa 1 1 n n ≡ s7→a;a.p 7→a ,...,a.p 7→a grid tuple t with attributes A1,...,An, the λ operator outputs grid Nava.p17→a1 ... Nava.pn7→an Scans7→a. Furthermore, thesystem out- an output tuple that has all the input attributes, and an ad- system put attribute name can be omitted if it is identical togrid the ditional attribute N that is the result of evaluating f using grid input attribute name. system system the A1,...,An values in t. For example, the SQL++ clause Tuple-at-a-time nesting operator Nested queries aregrid trans- ORDER BY g(X),f(Y,Z) is translated into: grid lated into SQL++ algebra using the apply-plan operatorsystem system τN ,N (λf(Y,Z)→N (λg(X)→N )). grid 1 2 2 1 grid αP →N . The α operator inputs a single table. In general, system system P is a correlated plan, such that attributes of the input ta- grid grid p ! aggregates Plan p ble may appear in P wherever constants or named id’ssystem can ! system appear in uncorrelated queries. Such correlated input at- grid grid tributes are called parameters of P . The α operator outputssystem !nation_key, name ! order_year, sum_price system one tuple t#[N : P/t] for each input tuple t, i.e., each out- grid Topk grid system ! selected = true AND 3 system put tuple has all attributes of the respective input tuple t, nation_ref = nation_key and an additional attribute N for the result of evaluatinggrid ScNa grid system session ! sum_price system the plan P/t, i.e. the non-correlated plan that results from .selected_nations ! s; s.nation_ref, substituting each parameter with its respective bindinggrid in s.selected grid system !order_year; system SUM(total_price) sum_price the tuple t. ScNadb.nations ! n; ! grid n.nation_key, grid Nesting via grouping SQL++ allows aggregate functions n.name system !date_part('year', order_date) system that produce non-scalar results. Of particular interest to the Ground ! order_year grid grid optimizations presented in this paper is the system ! nation_ref = @nation_key AND system cust_ref = cust_key NEST(A1,...,At) → N function whose output is a new tablegrid grid system ScNadb.customers ! c; system N. Formally, the function is evaluated on each input sub- c.cust_key, c.nation_ref grid grid collection I1,...,Il created by γG1,...,Gn;NEST(A1,...,At)→N ac- ScNa system db.orders ! o; system cording to the group-by attributes G1 ...Gn. For each input o.cust_ref, o.order_date, o.total_price grid grid sub-collection Ii, the function NEST performs the projection system Ground system πA1,...,An (Ii) and outputs the table attribute N. grid grid Partition The partition operator χ enables support ofsystem the Figure 7: Initial plan from translation into algebra system SQL 2003 PARTITION feature and window functions. In ad- Figure 7 shows the initial algebraic plan translated from 4In case s identifies a non-collection value v and coercion is the SQL++ query of Figure 6. The Scansession.selected_nations enabled, the case is treated as if s had identified the collec- operator scans an in-memory Java collection object and out- tion {v}. puts a tuple with attributes nation_ref and selected. The grid grid system system α operator represents the SELECT clausegrid nested query: the !" grid !"#$%!"#%&'%()*% CS ! E X = V !…! X = V correlated plan p (within the dottedsystem box) corresponds to )% !"1 1 1 1 n n system Lines 3-11 of the SQL++ query. Noticegrid the use of the pa- !"#$%!"$%&'%()*% CS E grid system m ! m !V1...Vn ; system !"#$%%&%&'%(% ScNaT !t 0 NEST (O1...Oo,A1...Aa,N1...Nq ) rameter @nation_key in the correlated plan. According to T ! E t.X1!V1…t.Xn!Vn grid %'+,+-#%.%/012%+*% 0 !N grid the semantics of α, for each nation inputsystem tuple t, the cor- % system Ground !P !N '+,+-#%)% 1 1 related plan is evaluated with the parametergrid @nation_key !" grid /012%%&%% substituted by the corresponding valuesystem in t. The set of tu- !P !N system ,+/#%13#+0%41"5%(% q q ples resulting from the correlated plangrid evaluation becomes %'+,+-#%% grid system ! V1...Vn,O1...Oo, A1...Aa system the value of the new attribute aggregates. The function %%'#()('*6%% grid %%5789(+ )+ ,- )- ,/ )/ *%&'%/( grid date_part is evaluated in a λ operator. # & # . # 0 !V ...V ;Topk (" ) system %/012%%%(% 1 n n T1...Tt system grid %%'+,+-#%%% grid 4.2 Apply-plan Rewriter ! h system %%%'#()('*6% (( system (((+#()(+&,(( grid !V ...V G ...G ; grid grid (((1#234(-"(-#()(1.234(-"(-,(grid 1 n 1 k !P!N system A A system system system f1 (.)!A1... fa (.)!Aa (((5#(-"(/#()(50(-"(/0( grid grid grid %%/012%% 6%6( grid ! c system & system system E CS ! E %%!$+0+%7( system !" 1 1 grid !"#"$%&'& grid %%:013;%<=%'#)'*89#)9grid:( F grid systemCS ! E system system &&(& m m %%$&>"5:%;( system grid grid %%10?+0%<=%%#()(%<( grid grid &&&&)*%+&$!!&,!&('-.&& !P !N system 1 1 %%,"2"#%@% system ! &&&&'.&& system!" system grid %*%A8%B@@7C% grid &&&&)*%+&"# &,!&('-& grid ScNaT t grid system $ !Pq!Nq %:013;%<=%' ()(' system 0! system # *( t.X !V …t.X !V system &&&&!"#"$%&& *%&'%@7897D% 1 1 n n grid grid grid grid system %%%%%%&!%'%&()%% ! O1...Oo,A1...Aa 15%= >' (A@D%)%A@D%=system>' system # # * *( Ground system %%%%%%*!+,-%.#%.!%'%*/+,-%.#%./)%% grid Topkn grid system %%%%%%0!%.#%1!%'%02%.#%12% grid system grid &&&&/012&3% system Figure 9: Pattern output by apply-plan rewrite rule system grid grid ! T1...Tt system &&&&)+"0"&4% grid system grid &&&&30145&67&5!%'%56% system system grid ! h grid system &&&&+,8*93&7% Suppose attributessystem X1,...,Xn of expression E (i.e. the &&&&10:"0&67&8 %'%8 input of α) are the parameters of the correlated plan P . grid ! 9% ! grid system &&&*2*%&:% G1...Gk ; f1 (.)!A1... fa (.)!Aa The rewriter replacessystem the α with the plan of Figure 9, which 5 grid %%-&,!&9& intuitively proceedsgrid in the following steps. ! c system /012&;% First the input systemE of the α is evaluated and assigned to a grid grid system F temporary T . Insystem the running example, this means that na- tion keys and names are computed. See in Figure 10, which grid Figure 8: Pattern for Set-Processable Queries and Plans grid system shows the rewrittensystem plan of Figure 7, that the temporary T1 grid grid system The apply-plan rewriter inputs a plan (eg, see the pattern contains the nationsystem keys and names. Notice that the nation grid of Figure 8) and replaces tuple-at-a-time α operators with key would have togrid be computed even if it were not included system set-at-a-time operators, thereby producing plans as the one in the SELECT clausesystem of the outer query. The reason is that grid shown in Figure 9. the nation key is necessarilygrid an input of α since it is a pa- system system The following case analysis proceeds in the following steps rameter of its nested plan. of increasing complexity: Second, the subplan of the right hand side of the outerjoin of Figure 9 is computed. This subplan has one output tuple 1. We first present the special case where the FROM clause t = [X1 : v1,...,Xn : vn,N : P/(X1 : v1,...,Xn : vn) has no join or outerjoin expressions (called expression- for each distinct tuple of parameters v1, . . . , vn. This tuple free FROM case) and the plans of the α operators also contains a binding attribute N, which carries the result either do not involve or any aggregation or they in- P/(X1 : v1,...,Xn : vn) , i.e., the result of the execution volve both a group-by clause and aggregation (called of the nested plan P when its parameters are instantiated no-total-aggregation case). The analysis will reveal the to v1, . . . , vn. IN the running example, the right hand side potential requirements placed on the object references subplan has attributes nation_key and aggregates. and object id’s of the data. Notice that the subplan has emerged from rewriting the ground of the original nested plan with a duplicate-eliminating 2. Then we extend to FROM clauses that may involve ex- (δ) projection of the parameters in E. Conceptually, this pressions. This case encompasses the expression-free- guarantees that the subplan will produce all combinations FROM case. of parameters with tuples from the FROM clause of the nested plan P . (Again, we stress that one should think of this ex- 3. Finally, we present the total aggregation case where we planaion purely “conceptually”, since subsequent optimiza- remove α operators whose plans involve aggregation tions will introduce various optimization efficiencies.) without grouping. For example, a total aggregation A number of issues, pertaining to aggregation and top-k, case would be a modification of the running example have to be taken care of by the rewriting. In particular, if where the nested query involves no grouping and re- there were any grouping (γ) in the original plan, then its turns the total sum of sales for its nation parameter. grouping attributes have to also include the parameters. If The total aggregation case has a fundamentally differ- there were any ordering (τ) and top-K (Topk) they have ent rewriting from the no-total-aggregation case. 5Notice that the plan does not yet describe the fine details of 4.3 Expression-free FROM with no total ag- its execution, neither is optimized for execution yet. These gregation optimizations and details will be added later. to be replaced by a partition (χ) operator. In the running parameteri v”, whereas the variable v ranges over the mem- example, the top-3 sales dates computation is achieved by bers of a collection that is provided as a parameter from the partitioning the input by the nation_key parameter and outer level. The algebraic counterpart will be a plan of the then keeping the top-3 of each partition. form αP 7→N E, where the nested plan P is Scanc7→v Ground Finally, the natural left outer-join ./ combines by join- and the outer plan E produces a binding attribute c. Note ing on the parameters, the temporary T (which intuitively that we deliberately focus on the scan corresponding to the carries the result of the outer query, along with parameters) nested query’s FROM. It will be obvious how the point below with the right hand side subplan (which carries the result applies even in the presence of selections, groupings, joins, of the inner query, also along with parameters). Simplifica- additional scans etc. tions of the rewrite rule are easy to produce for when one The elimination of α in this example requires that the pa- or more of the assignments, π, Topk, τ, σ or γ operators are rameter c appears in the group-by list. From an implemen- absent. tation point of view, this is efficient (and therefore worthy grid In the case that the plan P of the α contained any as- of reducing tuple-at-a-time to set-at-a-time)grid only if c is a system signments, these assignments would simply be set-processed directly or indirectly identifiable object.system We call it directly grid (by application of the exact same rewritings that we use to identifiable if c is an object. We call it indirectlygrid identifiable system system rewrite the result expression) and would be computed before if it is only the collection value of an attribute of a tuple grid grid system the computation of the outerjoin. that is an object (i.e., identifiable). Insystem the indirect case, the tuple’s id can effectively operate as a proxy for the missing grid grid system T1 Main plan id of c itself. system grid In either case, the source has to eithergrid explicitly support system !nation_key, name the export of id’s to Collage, so that thesystem grouping according grid to c is executed at Collage, or, the sourcegrid needs to be able to system ScNa !nation_key; system ! selected = true AND T1 ! t; efficiently group according to object references/id’s, so that nation_ref = nation_key t.nation_key, NEST(order_year, sum_price) grid t.name ! aggregates grid ScNa Collage can defer to the source. The Java runtime is such a system session Ground !nation_key, system .selected_nations ! s; order_year, sum_price source, in the sense that it enables an implementation where grid s.nation_ref, grid s.selected system !nation_key;Topk3 ("sum_price ) Collage can group by object references.system ScNadb.nations ! n; As an example of the above case consider the following grid n.nation_key, grid system n.name !nation_key, order_year; query system SUM(total_price) ! sum_price grid Ground grid system !date_part('year', order_date) SELECT c.name, (SELECT n.* FROM c.nationssystem n) FROM session.countries c ! order_year grid grid system ! nation_ref = nation_key AND system cust_ref = cust_key This query is set processable only if c.nations is an object grid grid ScNadb.customers ! c; (direct case) or c is an object (indirect case). system c.cust_key, c.nation_ref system grid grid ScNadb.orders ! o; 4.4 The total aggregation case system o.cust_ref, o.order_date, system o.total_price An extra challenge is introduced by nested queries that grid grid system ! feature aggregation but do not featuresystem grouping. For exam- grid ple, consider the following query that computesgrid the sum of ScNaT1 ! t; system t.nation_key sales per customer. system grid Ground grid system SELECT cust_key, ( system SELECT SUM(o.total_price) AS sum_price Figure 10: Plan after apply-plan rewriting FROM db.orders o We call this type of plan a normalized-sets plan, i.e. for WHERE c.cust_key = o.cust_ref γG1,...,Gn;NEST, the grouping attributes includes only output FROM db.customers c attributes from E that are in correlated attributes X1,...,Xn. For example in Figure 10, the γNEST has only grouping at- Then consider a customer c that has made no order yet. tribute nation_key which corresponds to the correlated at- By following the provided α-removal rule we derive a plan tribute, whereas the non-correlated name is deferred until af- where the sum of sales is null, as opposed to the correct, ter the ./. Section 4.8 explains why these plans are superior which is zero. In such cases where there are no grouping than alternate denormalized-sets plans where the grouping attributes, any null that appears in lieu of the value of the attributes of γNEST also includes non-correlated attributes, aggregate f should be replaced with the result of this func- i.e. the input of γNEST is denormalized with respect to the tion f when it is given an empty group We omit the algebraic correlated attributes. specifics of performing this change of value. When there are more than one α operator in a plan, the rewriter eliminates them by repeated application of the 4.5 Unrestricted FROM clause rewrite rule in a top-down manner: an α that appears in a Next we consider FROM clauses that contain join and/or nested plan is eliminated after its parent α has been elimi- outerjoin expressions. While joins can be rewritten away, nated. outerjoins cannot be rewritten away and therefore we will Requirements of set processability on object id’s and focus on them. We will first consider the case where the references Under certain circumstances described next, the FROM clause of the nested query is solely a single outerjoin. reduction of tuple-at-a-time processing to set-at-a-time pro- In particular, consider the expression αl./cr7→N E, where the cessing (set processability) is dependent on the availability left hand side l of the outerjoin involves parametersx ¯l and of object references and object-id’s at the source data. In the right hand side r of the outerjoin involves parameters particular, consider a nested FROM clause “FROM hcollection x¯r. If the right hand side r is uncorrelated, i.e.,x ¯r = ∅, ing whether it is executed in-memory or in-database. then the α removal proceeds similarly to the expression- The distributor is tuned for characteristics common to free FROM case. Namely, the temporary T = δπbarxl,x¯r E most web applications: persistent data in the database data is computed first. Then the ground of l is replaced with are orders of magnitude larger than both (1) the transient T (the resulting expression denoted as l(T )) and the usual memory data input by the SQL++ query, and (2) the output combination of γNEST and outerjoin accomplishes the required of the SQL++ query as displayed on the page. Therefore, nesting. In summary, the rewritten plan is the source decision of an operator is determined through bottom-up sub-plan maximization as follows: (i) If an oper- ator is source-specific, it is decided to be executed in that T 7→ δπbarx ,x¯ E; l r source. (ii) If an n-ary operator (i.e. join, ∪ and ∩, which are E ./γ ¯ (l(T ) ./c r) x¯l;NEST(O) all flexible) has children distributed across both sources, the where O is the list of binding attributes of l and r. operator is decided to be in-database, and the in-memory If the right hand side r is also correlated (i.e.,x ¯r 6= ∅) then inputs are copied to the database. This is because copying the ground of r is also replaced with T . Note however that in in-memory inputs to the database is much faster than the order to avoid cardinality errors in the result, a tuple of l(T ) opposite. (iii) Otherwise, the operator is flexible and has should match a tuple of r(T ) only if they are derived from children within a single source, therefore the operator is de- the same tuple of T . Technically this intuition is captured by cided to be in the same source as its children, in order to changing accordingly the condition of the ./. In particular, recursively maximize the size of sub-plans. 0 ¯0 let us denote by r the plan r/(x ¯r 7→ xr), i.e., the plan Figure 11 shows the plan after the distributor has deter- that results from the substitution of the parametersx ¯r with mined source decisions, as denoted with db or mem. Cuts (i.e. 0 parameters with fresh namesx ¯r . Then the needed rewriting data transfers) between a pair of adjacent in-memory and in- is database operators are denoted with dotted perpendicular lines. T 7→ δπbarxl,x¯r E; 0 4.7 Plan-to-SQL Translation E ./γ ¯ (l(T ) ./ ¯0 r (T )) grid x¯l;NEST(O) x¯r =xr ∧c grid system system It is straightforward to see how the above generalizes to T db Pmem grid grid i ! i system the case where the plan of α were not simply an outerjoin system db mem T j ! Pj grid but also included grouping, top-k etc. grid system system Pmem Pmem 4.6 Distributor k k grid grid mem system systemSe ndQuery Pdb grid db T1 grid system Main plan system grid db mem grid system !nation_key, name system P!db db mem db mem P nation_ref = j grid mem!nation_key; Pi grid system nation_key ScNaT1 ! t; NEST(order_year, sum_price) system db t.nation_key, ! aggregates mem grid ScNadb.nations db t.name grid db system ! n; system ! selected n.nation_key, Ground nation_key, Figure 12: Translating Maximal Sub-plans to SQL = true ! n.name order_year, sum_price grid mem db db Figure 12 illustrates how the plan-to-SQLgrid translator trans- system Ground !nation_key;Topk (" ) system ScNasession 3 sum_price grid .selected_nations ! s; lates maximal sub-plans to SQL. Givengrid source decisions on s.nation_ref, db system mem s.selected !nation_key, order_year; each operator, the plan-to-SQL translatorsystem performs the fol- SUM(total_price) ! sum_price db grid Ground db lowing recursively: Each maximal in-databasegrid sub-plan P mem system !date_part('year', order_date) is replaced by a SendQuery operator,system where S is the SQL ! order_year S db db grid nation_ref = statement translated from P . As agrid special case, when system nation_key mem system db SendQueryS is the root operator of e in an T ← e, a tem- grid db grid cust_key = db porary table T is created directly from the results of S with- system cust_ref system ! out any data transfer. Each maximal in-memory sub-plan grid db db db mem db grid db mem system ScNa Q that is an input to P is movedsystem to a new U ← Q , ScNadb.customers ScNadb.orders ! o; T1 ! t; t.nation_key grid ! c; o.cust_ref, which creates a temporary table U in thegrid database and bulk c.cust_key, o.order_date, system c.nation_ref o.total_price copies the result of evaluating Q intosystemU. U is subsequently db db db grid utilized by S. The final plan of the runninggrid example as system Ground Ground Ground output by the plan-to-SQL translator issystem shown in Figure 13. Figure 11: Plan after distributor’s source decisions 4.8 Alternate Denormalized-sets Plan The distributor decides the source where each operator To illustrate an alternate rewrite rule for replacing α op- will be executed, with the objective of outputting efficient erators, consider the denormalized-sets plan in Figure 14, physical plans that limit the number of tuples transferred which is Figure 7 rewritten with the alternate rule, and aug- between sources. Operators are either source-specific or mented with source decisions for ease of exposition. The flexible. A source-specific operator is: (1) a ScNa (2) an α operator is replaced, but with key differences from the α (3) a λf where f is a source-specific function, or (4) normalized-sets plan of Figure 10. As quantified by ex- a γG¯;f1→N1,...,fm→Nm where one or more of f1 . . . fm are periments in Section 5.2, this denormalized-sets plan has source-specific. All other operators are flexible. For each two inherent performance limitations as compared to the operator, the distributor provides a source decision indicat- normalized-sets plan. grid grid system system grid grid system system grid grid system system db T2 Main plan additional tuples will similarly increase subsequent process- grid grid system mem mem ing and data transfer costs. In contrast,system the normalized-sets ! selected = true grid mem mem plan avoids these redundancies by addinggridScan and δ opera- system mem system ScNasession SendQuery !nation_key; tors (which is reminiscent of the classical semi-join technique .selected_nations ! s; NEST(order_year, sum_price) grid s.nation_ref, !"#"$%&'()*+',-./0&& ! aggregates in distributed query processing, and deferringgrid ./ till after all system s.selected &&&&&&&'(5.& system other operators. mem 89:;&&&%Q& mem grid grid Ground SendQuery Second, recall that ./ is not a commutative operator, system and thus restricts join reordering optimizationssystem 6. In the grid !"#"$%&&'()*+',-./0&+12.1,/.(10&345,61*7.& grid db denormalized-sets plans, ./ is decided to be in-database, system T1 89:;&&&&<& system &!"#"$%&,=>?0& thus the resulting SQL query specifies (T3 ./ nations) ./ grid !"#"$%&'()*+',-./0&&& &&&&&&&&1+@,'45A.1&:B"9&<& grid system &&&&&&&'(5.&& &&&&&&&&&&CD9%E%E:F&GH&,=>'()*+',-./& (customers ./ orders), where T3 (notsystem shown in figure) is a 89:;&&&'()*+'3&D!&'& &&&&&&&&&&:9I"9&GH&,=>345,61*7.& grid temporary table bulk loaded with session.selected_nationsgrid N:EF&&&%=>'()*+',1.P&O& &&&&&&&&J&D!&1+@,*2& system system &&&&&&&'>'()*+',-./& &89:;&&&<& tuples. Due to ./, no join reordering occurs, thus the database grid &&!"#"$%&&&'()*+',-./0& first executes the expensive join betweengrid customers and system &&&&&&&&&&&2().,6(1) First, observe that for αP →N (E) where E has correlated 5.1 Tuple-at-a-time vs Set-at-a-time plans attributes X1,...,Xn, there is a functional dependency from In Figure 15, the number of selected nations k is varied X1,...,Xn to the attributes of N. For example, from 5 to 25 to show the speedup of set-at-a-time versus nation_key → order_year,sum_price. In the denormalized- tuple-at-a-time semantics at different sizes of results dis- sets plan, ./ denormalizes intermediate results by joining played on pages. Due to their tuple-at-a-time semantics, on X ,...,X , thus introducing more non-correlated at- 1 n both RoR and T issue k queries, i.e. one query per selected tributes and/or tuples from its LHS. This creates two types nation. The join between customers and orders, which is of performance penalties in Figure 14. (1) Horizontal redun- an expensive operation in each query since the low join se- dancy: ./ outputs wider tuples due to the additional name lectivity causes Postgresql to scan both tables sequentially, attribute, which increases processing costs of the γSUM, χ, π is evaluated k times in total. Not only do the running times and γNEST operators above, as well as the data transfer cost across the cut. (2) Vertical redundancy: ./ outputs more 6While database researchers have shown that rewriting ./ tuples than its RHS. If the correlation attributes were other and other join operators to a “generalized-join” operator can than nation_key and have duplicate values in the LHS, each enable further join reordering [12], we did not find a RDBMS RHS tuple will be copied once per duplicate value, and the that implements this optimization. ! "# "! $# $! $%&'() !"# $"! %&"' %(") %*"( *++,-%,'. &&"! )*"( #!"! ''"* %+#"' /0/ &)"( )!"# !)"& $'"% %%#"( 1+))23+ 45167(6/0/8 ("& )"$ *"' !"( !"# %)+, due to the added partition χ operator. However, at the low %&+, selectivities of fanouts 4,500 and 22,500, RoR and T incurs %++, sequential scans of customers and index scans of orders, '+, whereas NS incurs sequential scans of both customers and 9:;)64()<0.2(86 #+, orders. This is a classical example of when low selectivity )+, leads to sequential scans being more efficient than multiple &+, index lookups, thus NS shows a speedup of 4.5-6.3x. At the +, *, %+, %*, &+, &*, lowest selectivity of fanout 112,500, all plans incur sequential -.-, &)"(, )!"#, !)"&, $'"%, %%#"(, scans for both tables, thus the reasons for speedup is iden- /, &&"!, )*"(, #!"!, ''"*, %+#"', tical to that of Figure 15. These data points illustrate that 01, !"#, $"!, %&"', %("), %*"(, when databases increase in size, the consequent increase in 1233452, ("&, )"$, *"', !"(, !"#, fanouts (and hence lowering of selectivities) in aggregation 601,78,-.-9, =>653;?)@60A6(),) !"* $"( )"* &"# Table 1 compares the running times between DS and NS. Two parameters are varied: (a) the width of a tuple in the $!+ nation table along a logarithmic scale, by changing the char- *!+ acter length of the name column (b) the fanout of customers )!+ per nation, by using the identical methodology and fanout values as in Section 5.1. The number of selected nations (!+ is also fixed at 10. Both parameters respectively highlight &!+ the two performance limitations of DS as explained in Sec- ?$@3<;23A%&:2>< %!+ tion 4.8. +++++++++++++++,,-./0,*!"12-3415.6+78+12-3415.69+ !+ From top-to-bottom for increasing tuple widths, the run- P!!+ )*!!+ &&9*!!+ %%&9*!!+ +++++++++++++++,,-./0,*%"4,4:;.:;2-.+78+4,4:;.:;2-.9+ +++++++++++++++8<=>7??+,,-./0,*%"4,-4-2@0:3A.B+78+CD/,A2@@,,Ening time of increases proportionally to the tuple width, G4G+ !")+ (#+ *!+ ()+ DS ++++++++++++FGH=whereas NS is unaffected. This illustrates the horizontal re- K+ !"'+ ((+ )(+ &)+ ++++++++++++++++++> M8+ !"#+ $+ %%+ %&+ +++++++++++++++++++++8I?IJK+7??dundancy experienced in DS, and the effectiveness of the 80..;D0+ ++++++++++++++++++++++++,,02:2/,("1,12-3415.6+78+12-3415.6Scan and δ operators in NS for avoiding these redundan- !"*+ $"(+ )"*+ &"#+ >M8+EC+G4GB+ +++++++++++++++++++++FGH=cies. Note that vertical redundancy is not demonstrated ,"&%-#B From left-to-right for increasing fanouts (i.e. lower nation- ++++++++++++++++++++++++8I?IJK+7?? Figure 16 shows how different join selectivities in the database+++++++++++++++++++++++++++4"4,ADC-5.6+78+4,ADC-5.69+customer selectivities), the running times for both DS and affects speedup. To vary join selectivities, different TPC-H+++++++++++++++++++++++++++4"4,-4-2@0:3A.+78+4,-4-2@0:3A.9+NS increase. Recall that DS executes (T3 ./ nations) ./ +++++++++++++++++++++++++++4"4,4:;.:;2-.+78+4,4:;.:;2-.(customers ./ orders) without the benefit of any join re- database instances are created by changing the total num-++++++++++++++++++++++++FGH= ber of nations from 25 to 500, 100, 20 and 4 respectively.+++++++++++++++++++++++++++0DO@3A"4:;.:C+78+4ordering. With a lower selectivity, the ./ on the large in- The total number of customers and orders remain the same,+++++++++++++++++++++B+78+,,-./0,*%termediate result customers ./ orders becomes more ex- +++++++++++++++++++++LMMIG+NHLM+pensive, thus DS increases in running time. On the other but customers are evenly re-distributed among the avail-+++++++++++++++++++++> able nations. This produces a logarithmic scale of nation-++++++++++++++++++++++++8I?IJK+7??hand, NS executes (T1 ./ customers) ./ orders after join to-customer fanouts: 900, 4,500, 22,500 and 112,500. The+++++++++++++++++++++++++++A"A,ADC-5.6+78+A,ADC-5.69+reordering, thus it is affected to a lesser extent by the lower +++++++++++++++++++++++++++A"A,12-3415.6+78+A,12-3415.6selectivity since customers is an order of magnitude smaller number of selected nations is fixed at 10, except in the++++++++++++++++++++++++FGH= last fanout where only 4 nations can be displayed. Higher+++++++++++++++++++++++++++0DO@3A"ADC-4/.:C,*!!+78+Athan customers ./ orders. nation-to-customer fanout corresponds to lower selectivity These data points illustrate that when tables become wider for nation ./ customers. and fanouts increase, NS will have increased benefits over At the high selectivity of fanout 900, PostgreSQL uses DS, thus providing empirical justification for Collage’s apply- index lookups against both customers and orders for all plan rewriting to output NS plans. of RoR, T and NS. In this case, NS does not increase perfor- mance over RoR, and in fact incurs a 0.1s penalty over T 6. RELATED WORK database server. Database and programming language research work has investigated the problem of efficient and/or single-language 7. REFERENCES access, both before and after the advent of ORMs. Single access via semistructured query languages Strudel [11][1] M. O. Akinde and M. H. B¨ohlen. Efficient showed that content-publishing web pages can be conve- computation of subqueries in complex olap. In ICDE, niently specified using a semistructured XML query lan- pages 163–174, 2003. guage. Later, XPERANTO [21] and SilkRoute [10] ad- [2] Anonymized due to double-blind requirements. dressed the optimization of XML queries that produce nested [3] Anonymized due to double-blind requirements. results by accessing an SQL database. The semistructured [4] Anonymized due to double-blind requirements. query is executed by emitting one or more SQL queries to [5] A. Badia. Computing SQL queries with boolean the database and consequently combining the results. aggregates. In Data Warehousing and Knowledge Unlike Collage, they do not consider analytics queries (i.e., Discovery, pages 391–400. Springer, 2003. queries with group-by, top-k), memory-database distribu- [6] B. Cao and A. Badia. A nested relational approach to tion or navigations into external objects (and the compli- processing SQL subqueries. In Proceedings of the 2005 cations they introduce on the reduction of tuple-at-atime ACM SIGMOD international conference on to set-at-a-time). Due to the limited (with respect to Col- Management of data, pages 191–202. ACM, 2005. lage) scope of their queries and the respective limited scope [7] A. Cheung, O. Arden, S. M. A. Solar-Lezama, and of their optimization decisions, they do not need Collage’s A. C. Myers. Statusquo: Making familiar abstractions algebra rewriting-based query processing engine. perform using program analysis. CIDR, 2013. In the context of non-distributed select-join nested queries [8] W. R. Cook and A. H. Ibrahim. Integrating (i.e., nested queries over just an SQL database, group-by and programming languages and databases: What is the top-k) XPERANTO, Silkroute and Collage still have differ- problem. In In ODBMS.ORG, Expert Article, 2005. ences. XPERANTO produces an SQL query q for each col- i [9] U. Dayal. Of nests and trees: A unified approach to lection ci of the nested query (be it the top level connection or a nested collection) by left outer-joining all the FROM- processing queries that contain nested subqueries, aggregates, and quantifiers. In VLDB, pages 197–208, WHERE clauses on the path to ci. Unlike Collage, each sub-query inefficiently uses the entire table produced for the 1987. parent collection although only the correlating attributes are [10] M. Fernandez, A. Morishima, and D. Suciu. Efficient required. evaluation of XML middle-ware queries. In ACM Tuple-at-a-time to set-at-a-time reduction in SQL SIGMOD Record, volume 30, pages 103–114. ACM, WHERE 2001. Many research efforts have been devoted to the sub-query [11] M. F. Fern´andez, D. Florescu, A. Y. Levy, and decorrelation in the WHERE clause since the 1980’s. Kim [19] D. Suciu. Declarative specification of web sites with first developed the query transformation algorithms to rewrite Strudel. VLDB J., 9(1):38–55, 2000. nested queries into equivalent, flat queries which can be [12] C. Galindo-Legaria and A. Rosenthal. Outerjoin processed more efficiently. Dayal [9], Muralikrishna [20], simplification and reordering for query optimization. Ganski-Wong [14] extended the work to support subqueries ACM Transactions on Database Systems (TODS), with aggregates and more than one level of nesting. 22(1):43–74, 1997. Galindo-Legaria and Joshi [13] represents the subqueries [13] C. A. Galindo-Legaria and M. Joshi. Orthogonal in SQL Server using apply operator (as Collage does), and optimization of subqueries and aggregation. In then removing the correlations by transforming them into SIGMOD Conference, pages 571–581, 2001. joins,semi-joins, outer-joins, etc., according to a set of rules [14] R. A. Ganski and H. K. T. Wong. Optimization of built upon. Other works [1, 6, 5] consider the prob- nested SQL queries revisited. In SIGMOD Conference, lem when correlated subqueries occur within positive and pages 23–33, 1987. negative existential (ANY, EXISTS, IN) or universal (ALL) [15] H. Garcia-Molina, J. D. Ullman, and J. Widom. quantification. The techniques all involve extending the Database Systems: The Complete Book, chapter 5, standard relational algebra. We consider them as comple- pages 205–241. Prentice-Hall, 2009. mentary to work ours. [16] T. Grust. Purely relational flwors. In XIME-P, Optimization of database-accessing application code volume 37, 2005. A second line of research adopts a conventional architecture, [17] T. Grust and M. Mayr. A deep embedding of queries where code written in a conventional application program- into Ruby. In ICDE Conference, 2012. ming language (e.g., Java) issues SQL statements directly [18] T. Grust, M. Mayr, J. Rittinger, and T. Schreiber. or indirectly through the use of an ORM. These works ana- Ferry: database-supported program execution. In lyze the code and, consequently, they generally rewrite the SIGMOD Conference, pages 1063–1066, 2009. application code to improve data access performance. [19] W. Kim. On optimizing an SQL-like nested query. Loop lifting [16] is a technique originally developed to ACM Trans. Database Syst., 7(3):443–469, 1982. handle XQuery FLOWR construct using relational algebra plans. Later, it was generalized and extended to the FERRY [20] M. Muralikrishna. Optimization and dataflow compilation framework [18], with a richer data model and algorithms for nested tree queries. In Proc. Int. Conf. more language constructs to handle queries in more general on Very Large Data Bases (VLDB), volume 516, 1989. programming languages. [21] J. Shanmugasundaram, E. Shekita, R. Barr, M. Carey, Finally, StatusQuo [7] automatically partitions the ap- B. Lindsay, H. Pirahesh, and B. Reinwald. Efficiently plication computation between the application server and publishing relational data as XML documents. The VLDB Journal, 10:133–154, 2001. [22] Tpc-h homepage. http://www.tpc.org/tpch/. [23] Wikipedia. Template processor, 2013. Accessed Aug 16 2013. http://en.wikipedia.org/w/index.php? title=Template_processor&oldid=567590946. [24] F. Yang, J. Shanmugasundaram, M. Riedewald, and J. Gehrke. Hilda: A high-level language for data-drivenweb applications. In ICDE, page 32, 2006.