Symmetric and Asymmetric Aggregate Function in Massively Parallel Computing

ZHANG Chao Supervised by Prof. Farouk Toumani, Prof. Emmanuel GANGLER

LIMOS, Universite´ Clermont Auvergne, Aubiere,` France {zhangch, ftoumani}@isima.fr

ABSTRACT When an arbitrary aggregation function is decomposable, Applications of aggregation for information summary have how to decompose it and when a decomposition is ’efficient’ great meanings in various fields. In big data era, processing is a hard nut to crack. Previous works identify interesting aggregate function in parallel is drawing researchers’ atten- properties for decomposing aggregation. A very relevant tion. The aim of our work is to propose a generic framework classification of aggregation functions, introduced in [11], is enabling to map an arbitrary aggregation into a generic al- based on the size of sub-aggregation (i.e., partial aggrega- gorithm and identify when it can be efficiently executed on tion). This classification distinguishes between distributive modern large-scale data-processing systems. We describe and algebraic aggregation having sub-aggregation with fixed our preliminary results regarding classes of symmetric and sizes, and holistic functions where there is no constant bound asymmetric aggregation that can be mapped, in a system- on the storage size of sub-aggregation. Some algebraic prop- atic way, into efficient MapReduce-style algorithms. erties, such as associativity and commutativity, are identi- fied as sufficient conditions for decomposing aggregation [17, 3]. Compared to these works, our work provides a generic 1. INTRODUCTION framework to identify the decomposability of any symmetric The ability to summarize information is drawing increas- aggregation and generate generic algorithms to process it in ing attention for information analysis [11, 6]. Simultane- parallel. Moreover, all but few researches in the literature ously, under the progress of data explosive growth process- consider symmetric functions. Asymmetric aggregation is ing aggregate function has to experience a transition to inherently non-commutative functions and this makes their massively distributed and parallel platforms, e.g. Hadoop processing in parallel and distributed environment far from MapReduce, Spark, Flink etc. Therefore aggregation func- being easy. In [16], a symbolic parallel engine (SYMPLE) is tion requires a decomposition approach in order to execute proposed in order to automatically parallelize User Defined in parallel due to its inherent property of taking several val- Aggregations (UDAs) that are not necessarily commutative. ues as input and generating a single value based on certain Although interesting, the proposed framework lacks guaran- criteria. Decomposable aggregation function can be pro- tees for efficiency and accuracy in the sense that it is up to cessed in a way that computing partial aggregation and then users to encode a function as SYMPLE UDA. Moreover, merging them at last to obtain final results. symbolic execution may have path explosion problem. Decomposition of aggregation function is a long-standing My research focuses on designing generic framework that research problem due to its benefits in various fields. In enables to map symmetric and asymmetric aggregation func- distributed computing platforms, decomposability of aggre- tions into efficient massively parallel algorithms. To achieve gate function can push aggregation before shuffle phase [17, this goal, we firstly identify a computation model, and an 3]. This is usually called initial reduce, with which the associated cost model to design and evaluate parallel algo- size of data transmission on a network can be substantially rithms. We consider MapReduce-style (MR) framework and reduced. For wireless sensor network, the need to reduce use the MRC [12] cost model to define ’efficient’ MR algo- data transmission is more necessary because of limitation of rithms. We rest on the notion of well-formed aggregation [4] power supply [15]. In online analytical processing (OLAP), as a canonical form to write symmetric aggregation and pro- decomposability of aggregate function enables aggregation vide a simple and systematic way to map well-formed aggre- across multi-dimensions, such that aggregate queries can be gation function α into an MR algorithm, noted by MR(α). executed on pre-computation results instead of base data to Moreover, we provide reducible properties to identify when accelerate query answering [8]. An important point of query the generated MR(α) is efficient (when MR(α) is an MRC optimization in relational is to reduce table size algorithm). Then we extend our framework to a class of for join [10], and decomposable aggregation brings interests asymmetric aggregation function, position-based aggrega- [4]. tion, and propose extractable property to have generic MRC algorithms. Our main results are Theorem 1 and Theorem 2, of which proofs are provided in an extended report[2].

Proceedings of the VLDB 2017 PhD Workshop, August 28, 2017. Munich, 2. MRC ALGORITHM Germany. Copyright (C) 2017 for this paper by its authors. Copying permitted for Several research works concentrate on the complexity of private and academic purposes. parallel algorithms. MUD[7] algorithm was proposed to Table 1: MR(α): a generic MR aggregation algo- rithm operation P mapper ⊕,d ∈X F (dj ) Pj i reducer T ( ⊕,i oi)

on the associative and commutative property of ⊕: pro- cessing F and ⊕ at mapper, ⊕ and T at reducer. Table 1 depicts the corresponding generic MapReduce(MR) algo- Figure 1: MapReduce flowchart with MRC con- rithm(the case of one key and trivially extending to any number of keys), noted by MR(α), where mapper input is straints P a submultiset Xi of X and mapper output is oi, and ⊕ is the concatenation of ⊕. transform symmetric streaming algorithms to parallel algo- However, the obtained MR(α) are not necessarily an effi- rithms with nice bounds in terms of communication and cient MapReduce algorithm. We identify when MR(α) is a space complexity, but without any bound on time complex- MRC algorithm using reducibility property. ity. This disqualifies MUD as a possible candidate cost model to be used in our context. MRC[12] is another popu- Definition 1. A symmetric aggregation function α defined lar model that has been used to evaluate whether a MapeRe- on domain I is reducible if the well-formed aggregation (F, ⊕ duce algorithm is efficient. The constraints enforced by ,T ) of α satisfies

MRC w.r.t. total input data size can be summarized as ∀di, dj ∈ I : |F (di) ⊕ F (dj )|= O(1). following: sublinear number of total computing nodes, sub- linear space for any mapper or reducer, polynomial time for With this reducible property, we provide a theorem iden- any mapper or reducer, and logarithm round number. We tifying when MR(α) of a symmetric aggregation is a MRC illustrate these constraints besides round number in a sim- algorithm. plified MapReduce flowchart in figure 1 where  > 0. Hence, the MRC model considers necessary parameters Theorem 1. Let α be a symmetric well-formed aggrega- for parallel computing, communication time, computation tion and MR(α) be the generic algorithm for α, then MR(α) space and computing time, and makes more realistic as- is an MRC algorithm if and only if α is reducible. sumptions. A MapReduce algorithm satisfying these con- 3.2 Deriving MRC Algorithm from Algebraic straints is considered as an efficient parallel algorithm and Properties will be called hereafter an MRC algorithm. In this section, we investigate several symmetric aggre- gation properties satisfying Theorem 1. If an aggregation 3. SYMMETRIC AGGREGATIONWITH MRC α is in one of the following classes, then α has an MRC(α) Let I be a doamin, an n-ary aggregation α is a function[9]: algorithm illustrated in table 1. In → I. α is symmetric or commutative[9] if α(X) = An aggregate function α is associative [9] if for multi- α(σ(X)) for any X ∈ I and any permutation σ, where set X = X1 ∪ X2, α(X) = α (α(X1), α(X2)) . Associative σ(X) = (xσ(1), ..., xσ(n)). Symmetric aggregation result does and symmetric aggregation function can be transformed not depend on the order of input data, therefore input is in well-formed aggregation (F, ⊕,T ) as following, considered as a multiset. In this section, we define a generic framework to map symmetric aggregation into an MRC al- F = α, , ⊕ = α, T = id (1) gorithm. where id denotes identity function. α is reducible because 3.1 A Generic Form for Symmetric Aggrega- it is an aggregation. Therefore MR(α) of associative and symmetric aggregation α is an MRC algorithm. tion An aggregation α is distributive [11] if there exists a com- To define our generic aggregation framework, we rest on bining function C such that α(X,Y ) = C(α(X), α(Y )). Dis- the notion of well-formed aggregation [4]. A symmetric ag- tributive and symmetric aggregation can be rewritten in gregation α defined on a multiset X = {d1, . . . , dn} can be well-formed aggregation (F, ⊕,T ) as following, written in well-formed aggregation as following: F = α, ⊕ = C,T = id. (2) α(X) = T (F (d1) ⊕ ... ⊕ F (dn)), Similarly, α is reducible and corresponding MR(α) is an where F is translating function(tuple at a time), ⊕ is a com- MRC algorithm. mutative and associative binary operation, and T is termi- Another kind of aggregate function having the same be- nating function. For instance, average can be easily trans- havior as symmetric and distributive aggregation is com- formed into well-formed aggregation: F (d) = (d, 1), (d, k)⊕ mutative semigroup aggregate function [5]. An aggre- 0 0 0 0 d gation α is in this class if there exists a commutative semi- (d , k ) = (d + d , k + k ) and T ((d, n)) = . In fact, any N n group (H, ⊗), such that α(X) = α(xi). The corre- symmetric aggregation can be rewritten into well-formed ag- xi∈X sponding well-formed aggregation (F, ⊕,T ) is illustrated as gregation with a flexible choice of ⊕, e.g ⊕ = ∪. following, Well-formed aggregation provides a generic plan for pro- cessing aggregate function in distributed architecture based F = α, ⊕ = ⊗,T = id. (3) It is clearly that α is reducible and MR(α) is an MRC Asymmetric well-formed aggregation can rewrite any asym- algorithm. metric aggregation α, and with the associative property of A more general property than commutative semi-group ⊕, α also has a generic MR algorithm MR(α): processing aggregation is symmetric and preassociative aggregate func- F o and ⊕ at mapper, ⊕ and T at reducer. Similar to the tion. An aggregation α is preassociative [13] if it satis- behavior of symmetric well-formed aggregation, reducible fies α(Y ) = α(Y 0) =⇒ α(XYZ) = α(XY 0Z). Accord- property is needed to ensure MRC constraints. The re- ing to [13], some symmetric and preassociative(unarily ducible property for asymmetric well-formed aggregation is quasi-range-idempotent and continuous) aggregation func- o o Pn  ∀xi, xi+1 ∈ X¯ : |F (X,¯ xi) ⊕ F (X,¯ xi+1)|= O(1). tions can be constructed as α(X) = ψ i=1 ϕ(xi) , n ≥ 1, where ψ and ϕ are continuous and strictly monotonic func- However, in order to have a correct generic MRC algo- Pn tion. For instance, α(X) = i=1 2 · xi, where ψ = id and rithm for asymmetric aggregation, reducible property is not ϕ(xi) = 2·xi. The well-formed aggregation (F, ⊕,T ) for this enough, because asymmetric function considers data order kind of preassociative aggregation is illustrated as following such that operations for combining mapper outputs are more F = ϕ, ⊕ = +,T = ψ. (4) than ⊕. We illustrate this problem and identify properties to have correct MRC algorithm for a class of asymmetric The corresponding MR(α) is also an MRC algorithm. well-formed aggregation in the following. An aggregate function α is barycentrically associative [14] if it satisfies α(XYZ) = α(Xα(Y )|Y |Z), where |Y | denotes 4.2 Position-based Aggregation with MRC |Y | the number of elements contained in multiset Y and α(Y ) We deal with a kind of asymmetric aggregation α called o o denotes |Y | occurrences of α(Y ). A well-known class of sym- position-based aggregation, for which F is F (X,¯ xi) = metric and barycentrically associative aggregation is quasi- h(i) f(xi), where h() and f() are unary functions, and  1  : α(X) = f −1 Pn f(x ) , n ≥ 1, is a binary operation. The corresponding asymmetric i=1 i P n well-formed framework is α(X¯) = T ( ¯ h(i) f(xi)), ⊕,xi∈X where f is an unary function and f −1 is a quasi-inverse of f. P where ⊕ is the concatenation of ⊕. With different choices of f, α can be different kinds of mean Let X¯ be an ordered sequence X¯ = S¯1 ◦...◦S¯m, where S¯l is functions, e.g arithmetic mean, quadratic mean, harmonic a subsequence of X¯, l ∈ {1, ..., m} and ◦ is the concatenation mean etc. It is trivial to rewrite this kind of aggregation into of subsequence, and i be the holistic position of xi in X¯ and well-formed aggregation (F, ⊕,T ) and the MR(α) is also an j be the relative position of xj in subsequence S¯l. Then MRC algorithm, P o ¯ ⊕ F (X, xi) of α on any subsequence Sl is Pn −1 f(xi) F = (f, 1), ⊕ = (+, +),T = f ( i=1 ). (5) X o ¯ X n F (X, xi) = h(j + k) f(xi), ⊕,x ∈S¯ ⊕,x ∈S¯ 4. ASYMMETRIC AGGREGATION i l j l where j + k (j + k = i) is the holistic position of the jth Many commonly used aggregation function is symmet- ¯ ric(commutative) such that the order of input data can be element xj in Sl. In order to process α in parallel on these ignored, while asymmetric aggregation considers the order. subsequences, the first requirement is to have l, which means Two common asymmetric cases could be weighted aggre- in distributed and parallel computing data set is split into gation and cumulative aggregation, where aggregated result ordered chunks and chunk indexes can be stored. It can be trivially implemented in Hadoop[16]. Secondly, k is needed, will be changed if data order is changed, e.g. WMA(weighted ¯ moving average) and EMA(exponential moving average)[1], the number of elements before Sl. Sequential distributing which are used to highlight trends. subsequence count values then starting aggregation is costly due to too many times of data transferring on network. If k P can be extracted out of ¯ h(j+k) f(xi), then α can 4.1 A Generic Form for Asymmetric Aggre- ⊕,xj ∈Sl gation be processed without distributing counts because operations In contrast to symmetric aggregation, asymmetric func- relating to count can be pushed to reducer. We identify tion is impossible to rewrite into well-formed aggregation, conditions to extract k which we call extractable property. because translating function F is a tuple at a time function Lemma 1. Given an ordered sequence X¯, a position-based and ⊕ is commutative and hence both of them are insensitive o to the order. For this reason, we propose an extended form asymmetric well-formed aggregation α defined in (F , ⊕,T ) o ¯ ¯ based on well-formed aggregation which is more suitable for and F (X, xi) = h(i) f(xi) for any xi ∈ X, where h() asymmetric aggregation. and f() are unary functions, is extractable if there exists a binary operation ⊗ making h() satisfy h(i + k) = h(i) ⊗ Definition 2. An asymmetric aggregation α defined on h(k + c) with a constant c, and ⊕, ⊗ and satisfy one of ¯ an ordered sequence X is an asymmetric well-formed aggre- the following conditions, gation if α can be rewritten as following, o o •⊗ , and ⊕ are same, α(X¯) = T (F (X,¯ x1) ⊕ ... ⊕ F (X,¯ xn)), (6) where F o is order-influenced translating function, ⊕ is a •⊗ and are same and they are distributive over ⊕, commutative and associative binary operation, and T is ter- minating function. •⊗ is distributive over which is same as ⊕. P i−1 For instance, α(X¯) = ¯ (1 − z) xi[14] with a con- xi∈X The behavior of h() is similar to group homomorphism o i−1 stant z can be rewritten as F (X,¯ xi) = (1 − z) xi, ⊕ = however they are not exactly same, and our intention is to +,T = id, where i is the position of xi in the sequence X¯. extract k instead of preserving exact operations. Theorem 2. Let α be a position-based well-formed ag- framework to mainstream parallel computing platforms (e.g. gregation and MR(α) be the generic algorithm for α, then Apache Spark). Moreover, we also plan to extend our frame- MR(α) is an MRC algorithm if α is reducible and extractable. work to cover additional classes of asymmetric aggregations. Finally, we plan to investigate how to generalize our ap- Extractable property of position-based aggregation α al- proach to nested aggregation functions (i.e., functions de- lows previous subsequences count value ’k’ to be extracted fined as a complex composition of aggregation functions). out of mapper operation, then α can be correctly processed by P F o or (P f(x ), P h(i)) at mapper phase. To ⊕ ⊕ i ⊕ 6. REFERENCES combine mapper outputs, more than ⊕ and T are needed [1] Moving average. and specific combining operation depends on the three dif- https://en.wikipedia.org/wiki/Moving_average. ferent extractable conditions (provided in our extended re- [2] Symmetric and asymmetric aggregate function in port[2]). massively parallel computing(extened version). ¯ For instance, given an input sequence X = (x1, ..., xn), https://hal-clermont-univ.archives-ouvertes. Pn (1 − a)i−1 · x ¯ i=1 i fr/hal-01533675. then EMA(X) = Pn i−1 , where a is a constant i=1(1 − a) [3] C.Liu, J.Zhang, H.Zhou, Z. S.McDirmid, and between 0 and 1. We give below the asymmetric well-formed T.Moscibroda. Automating distributed partial i−1 aggregation of EMA, where h(i) = (1 − a) , aggregation. In SOCC’14, pages 1–12, 2014.   [4] S. Cohen. User-defined aggregate functions: bridging F o : F o(X,¯ x ) = h(i) · x , h(i) , i i theory and practice. In SIGMOD’06, pages 49–60,     2006. ⊕ : h(i) · x , h(i) ⊕ h(i + 1) · x , h(i + 1) i i+1 [5] S. COHEN, W.NUTT, and Y.SAGIV. Rewriting   queries with arbitrary aggregation functions using = h(i) · xi + h(i + 1) · xi+1, h(i) + h(i + 1) , views. ACM TODS, 31(2):672–715, June 2006. n n Pn X X i=1 h(i) · xi [6] A. Cuzzocrea. Aggregation and multidimensional T : T ( h(i) · xi, h(i)) = Pn . analysis of big data for large-scale scientific i=1 h(i) i=1 i=1 applications: models, issues, analytics, and beyond. In It is clearly that EMA is a position-based aggregation, and SSDBM’15, 2015. EMA is reducible because ⊕ is a pair of addition. Moreover [7] J. Feldman, S.muthukrishnan, A. Sidiropoulos, h() satisfies h(i + k) = h(i) · h(k + 1), and the correspond- C. Stein, and Z. Svitkina. On distributing symmetric ing three binary operations ⊗ = ·, = ·, ⊕ = + sat- streaming computations. ACM TALG, 6(4), August isfy the second extractable condition. Therefore EMA has a 2010. MRC algorithm(the generic MRC algorithm for the second [8] M. Franklin. An overview of data warehousing and extractable condition) illustrated as following, where we as- olap technology. ACM SIGMOD Record, 26(1):65–74, ¯ ¯ ¯ sume input sequence X = S1 ◦ ... ◦ Sm and mapper input is March 1997. Sl, l ∈ {1, ..., m}, and count(S0) = 0, [9] M. Grabisch, J.-L. Marichal, R. Mesiar, and E. Pap.  0 P 00 P Aggregation function: Means. Information Sciences, • mapper: OM = h(j)·xj , OM = h(j), l xj ∈Sl l xj ∈Sl 181(1):1–22, January 2011. 000  [10] H.Garcia-Molina, J.D.Ullman, and J.Widom. OMl = count(Sl) , System Implementation. Prentice-Hall, New Jersey, 000 m Pl−1 OM 2000. P OM 0 · (1 − a) j=0 j • reducer: l=1 l . [11] J.Gray, A.Bosworth, A.Layman, and H.Pirahesh. Data m 00 Pl−1 OM 000 P j=0 j cube: A relational aggregation operator generalizing l=1 OMl · (1 − a) group-by, cross-tab, and sub-totals. and 5. CONCLUSION AND FUTURE WORK Knowledge Discovery, 1(1):29–53, Janaury 1997. [12] H. Karloff, S. Suri, and S. Vassilvitskii. A model of In this work, we studied how to map aggregation func- computation for . In SODA’10, pages tions, in a systematic way, into generic MRC algorithms 938–948, 2010. and we identified properties that enable to efficiently execute symmetric and asymmetric aggregations using MapReduce- [13] M.Jean-Luc and T.Bruno. Preassociative aggregation style platforms. For symmetric aggregation, we proposed functions. Fuzzy Sets and Systems, 268:15–26, June the reducible property within well-formed aggregation frame- 2015. work to satisfy space and time complexity of MRC. Several [14] M.Jean-Luc and T.Bruno. Strongly barycentrically algebraic properties of symmetric aggregation leading to a associative and preassociative functions. Fuzzy Sets generic MRC algorithm have been identified. Moreover, we and Systems, 437(1):181–193, May 2016. extended the notion of well-formed aggregation to asym- [15] S.Madden, M.J.Franklin, J.M.Hellerstein, and metric aggregation and showed how it can be exploited to W.Hong. Tag: a tiny aggregation service for ad-hoc deal with position-based asymmetric aggregation. Through sensor networks. In OSDI’02, pages 131–146, 2002. identifying the problem for parallelizing it, we proposed ex- [16] V.Raychev, M.Musuvathi, and T.Mytkowicz. tractable property and merged it with the reducible prop- Parallelizing user-defined aggregaions using symbolic erty of asymmetric well-formed aggregation to have MRC execution. In SOSP’15, pages 153–167, 2015. algorithms. [17] Y.Yu, M.Isard, and P. Gunda. Distributed aggregation Our future work will be devoted to the implementation for data-parallel computing: Interfaces and and experimentation. We will study the extension of our implementations. In SOSP’09, pages 247–260, 2009.