
A Relational Matrix Algebra and its Implementation in a Column Store

Oksana Dolmatova
University of Zurich, Zurich, Switzerland
dolmatova@ifi.uzh.ch

Nikolaus Augsten
University of Salzburg, Salzburg, Austria

Michael H. Böhlen
University of Zurich, Zurich, Switzerland
boehlen@ifi.uzh.ch

ABSTRACT
Analytical queries often require a mixture of relational and linear algebra operations applied to the same data. This poses a challenge to analytic systems that must bridge the gap between relations and matrices. Previous work has mainly strived to fix the problem at the implementation level. This paper proposes a principled solution at the logical level. We introduce the relational matrix algebra (RMA), which seamlessly integrates linear algebra operations into the relational model and eliminates the dichotomy between matrices and relations. RMA is closed: All our operations are performed on relations and result in relations; no additional data structure is required. Our implementation in MonetDB shows the feasibility of our approach, and empirical evaluations suggest that in-database analytics performs well for mixed workloads.

DOI: 10.1145/3318464.3389747

ACM Reference format:
Oksana Dolmatova, Nikolaus Augsten, and Michael H. Böhlen. 2020. A Relational Matrix Algebra and its Implementation in a Column Store. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD'20), June 14-19, 2020, Portland, OR, USA, 16 pages. DOI: 10.1145/3318464.3389747

1 INTRODUCTION
Many data that must be analyzed include numerical parts that are stored in relations, for example, sensor data from industrial plants, point of sales data, or scientific observations. The analysis of these data, which are not purely numerical but also include important non-numerical values, demands mixed queries that apply relational and linear algebra operations on the same data.

Dealing with mixed workloads is challenging since the gap between relations and matrices must be bridged. Current relational systems are poorly equipped to deal with this task. Previous attempts have focused on the implementation level, for example, by storing matrices in special key-value structures or relations, by introducing ordered data types, or by splitting queries into their relational and matrix parts. This paper resolves the gap between relations and matrices and introduces the relational matrix algebra (RMA) to support complex data analysis within the relational model. We show that the relational model is well-suited for complex data analysis if ordering and contextual information are dealt with properly.
and result the identify of uniquely values) and (at- lations non-numerical information and contextual names the tribute from constructed are with relations gins return operations matrix tional from computed is information operations contextual matrix for the order Instead, and structures. relevant relations data on ordered any based introduce purely not is does RMA properly. with are dealt information contextual and ordering if analysis data plex information. non-numerical maintain and process atically in approach. our RMA of of feasibility implementation the Our shows algebra MonetDB linear model. relational integrating the seamlessly into by relations matri- and between dichotomy ces the eliminate We model. linear fa- the algebra and on relational focus the works between Other transition the goals: cilitating these achieve to We first system. the the existing are an prove extending (3) by model and our level, of feasibility physical from the independence achieve at al- so implementation linear (2) the and level, relations logical of the integration at the gebra solve (1) Œe to model. is relational goal the within analysis data complex port aig o h he ls(Blo,”et,ad”e” one ”Net”, query and SQL Œe ”Heat”, film). (”Balto”, per films column three the for ratings epooeapicpe ouinfrmxdworkloads mixed for solution principled a propose We the since challenging is workloads mixed with Dealing eetn h ytxo Q ospotrltoa matrix relational support to SQL of syntax the extend We system- and relations over operations linear define We SELECT ( User * FROM , Balto INV eainlmti algebra matrix relational , Heat rating ( nteagmn eain.Alrela- All relations. argument the in ihe .B H. Michael , nvriyo Z of University Z boehlen@ifi.uzh.ch Net rc,Switzerland urich, ¨ ) htsoe sr n their and users stores that , BY sr); User ohlen urich ¨ ¨ RA osup- to (RMA) origins rating Ori- . with orders the relation by users and computes the inversion of Some operations, e.g., matrix1 multiplication, can be expressed the matrix formed by the values of the ordered numerical via syntactically complex and slow SQL queries. Œe set of columns. Œe result is a relation with the same schema: Œe operations is limited and does not include operations whose values of aŠribute User are preserved, and the values of the results depend on the row order. For instance, there are remaining three aŠributes are provided by matrix inversion no SQL solutions for inversion or determinant computation. (see Section 5 for details). Œe origin of a numerical result Complex operations must be programmed as UDFs. Ordonez value is given by the user name in its row and the at- et al. [25] suggest UDFs for linear regression with a matrix- tribute name of its column. like result type. Œe UTL NLA package[24] forOracleDBMS At the system level, we have integrated our solution into offers linear algebra operations defined over UTL NLA ARRAY. MonetDB. Specifically, we extended the kernel with rela- UDFs provide a technical interface but do not define matrix tional matrix operations implemented over binary associa- operations over relations. No systematic approach to main- tion tables (BATs). Œe physical implementation of matrix tain contextual information is provided. operations is flexible and may be transparently delegated Luo et al. [19] extend SimSQL [8], a Hadoop-based rela- to specialized libraries that leverage the underlying hard- tional system, with linear algebra functionality. RasDaMan ware (e.g., MKL [2] for CPUs or cuBLAS [1] for GPUs). 
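To make the intended behavior of the introduction's example query, SELECT * FROM INV(rating BY User), concrete, the following Python sketch (ours, not part of the paper) mimics what the operation computes on a small stand-in for the rating relation: the tuples are ordered by User, the matrix formed by the film columns is inverted, and the User values and attribute names are kept as context. The dictionary-of-columns layout and the helper name inv_by are assumptions made only for illustration; the values match the example database shown later in Figure 5.

import numpy as np

# Hypothetical stand-in for the rating relation: one order attribute
# (User) and one numerical attribute per film (values as in Figure 5).
rating = {
    "User":  ["Ann", "Tom", "Jan"],
    "Balto": [2.0, 0.0, 1.0],
    "Heat":  [1.5, 0.0, 4.0],
    "Net":   [0.5, 1.5, 1.0],
}

def inv_by(rel, order_attr):
    # Mimics SELECT * FROM INV(rel BY order_attr): sort by the order
    # attribute, invert the matrix formed by the remaining numerical
    # columns, and keep the order values and attribute names as context.
    idx = sorted(range(len(rel[order_attr])), key=lambda i: rel[order_attr][i])
    value_attrs = [a for a in rel if a != order_attr]
    m = np.array([[rel[a][i] for a in value_attrs] for i in idx])
    h = np.linalg.inv(m)                                 # base result
    out = {order_attr: [rel[order_attr][i] for i in idx]}
    for j, a in enumerate(value_attrs):                  # schema is preserved
        out[a] = list(h[:, j])
    return out

print(inv_by(rating, "User"))   # a relation with schema (User, Balto, Heat, Net)

The point of the sketch is that the result is again a relation with the same schema, so it can feed directly into further relational operators.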
Œe [6, 21] manages and processes image raster data. Both sys- new functionality is introduced without changing the main tems introduce matrices as ordered numeric-only aŠribute data structures and the processing pipeline of MonetDB, and types. Although relations and matrices coexist, operations without affecting existing functionality. are defined over different objects. Linear operations are not Our technical contributions are as follows: defined over unordered objects and they do not support con- textual information for individual cells of a matrix. • We propose the relational matrix algebra (RMA),which SciQL [32, 33] extends MonetDB [22] with a new data extends the relational model with matrix operations. type, ARRAY, as a first-class object. An array is stored as an Œis is the first approach to show that the relational object on its own. Arrays have a fixed schema: Œe last at- model is sufficient to support matrix operations. Œe tribute stores the data values of a matrix, all other aŠributes new set of operations is closed: All relational ma- are dimension aŠributes and must be numeric. Arrays come trix operations are performed on relations and re- with a limited set of operations, such as addition, filtering, sult in relations, and no additional data structure is and aggregation, and they must be converted to relations to required. perform relational operations. Œe presence of contextual • We show that matrix operations are shape restricted, information and its inheritance are not addressed. which allows us to systematically define the results Œe MADlib library [15] for in-database analytics offers of matrix operations over relations. We define row a broad range of linear and statistical operations, defined and column origins, the part of contextual informa- as either UDFs with C++ implementations or Eigen library tion that describes values in the result relation, and calls. Matrix operations require a specific input format: Ta- prove that all our operations return relations with bles must have one aŠribute with a row id value and another origins. array-valued aŠribute for matrix rows. Matrix operations • We implement and evaluate our solution in detail. return purely numeric results and cannot be nested. We show that our solution is feasible and leverages Hutchison et al. [16] propose LARA, an algebra with tuple- existing data structures and optimizations. wise operations, aŠribute-wise operations, and tuple exten- RMA opens new opportunities for advanced data analyt- sions. LARA defines linear and opera- ics that combine relational and linear algebra functionality, tions using the same set of primitives. Œis is a good ba- speeds up analytical queries, and triggers the development sis for inter-algebra optimizations that span linear and re- of new logical and physical optimization techniques. lational operations. LARA offers a strong theoretical basis, Œe paper is organized as follows. Sect. 2 discusses related works out properties of the solution, and allows to store row work. We introduce basics in Sect. 3 and introduce relational and column descriptions during the operations. Œe main- matrix algebra (RMA) in Sect. 4. We show an application tenance of contextual information is not considered for op- example in Sect. 5, discuss important properties of RMA in erations that change the number of rows or columns. Sect. 6, and its implementation in MonetDB in Sect. 7. We LevelHeaded [4, 5], an engine for relational and linear al- evaluate our solution in Sect. 8 and conclude in Sect. 9. 
gebra operations, uses a special key-value structure: Each object has keys (dimension aŠributes) and annotations (value 2 RELATED WORK 1Some approaches support multi-dimensional arrays. Since we target Relational DBMSs offer simple linear algebra operations, such linear algebra, we focus on two dimensions and use the term matrix as the pair-wise addition of aŠribute values in a relation. throughout. r aŠributes). Dimension and value aŠributes are stored in a d e d  e O V W trie and a flat columnar buffer, respectively. Linear oper- 1 1 2 1 2 3 A 30 1 ations are available through an extended SQL syntax. Key 1 D 1 1 3 1 D 1 3 C 22 5 values guarantee contextual information for rows. However, 2 B 2 2 4 2 B 2 4 the trie key structure restricts relational operations: For ex- B 10 1 ample, aggregations of keys and predicates over non- ,  key aŠributes (i.e., subselects in SQL) are not allowed. Figure 1: Relation r; matrices d e, and d e SciDB [30] is a DBMS that is based on arrays. Matrices and relations are implemented as nested arrays. SciDB fo- 3 PRELIMINARIES cuses on efficient array processing and performs linear al- Œis section presents notation for relations and matrices, gebra operations over arrays. SciDB supports element-wise and introduces the basic matrix algebra operations. operations and selected linear operations, such as SVD. Œe system also offers relational algebra operations on arrays 3.1 Relations but cannot compete with relational DBMSs such as Mon- A relation r is a set of tuples ri with schema R. A schema, etDB in terms of performance. A systematic approach to R = (A, B,...), is a finite, ordered set of aŠribute names. A maintain contextual information is not considered. tuple ri ∈ r has a value from the appropriate domain for Statistical packages, such as R [26] and pandas [20], of- each aŠribute in the schema. We write ri .A to denote the fer a broad range of linear and relational algebra operations value of aŠribute A in tuple ri and r.A to denote the set of over arrays. Each cell may be associated with descriptive all values ri .A in relation r. Ordered subsets of a schema, information, but this information is not always inherited U ⊆ R, are typeset in bold. |r | is the number of tuples in usv as part of operations (e.g., ). No systematic solution for relation r. associating contextual information with numeric results is Let r be a relation and U ⊆ R be aŠributes that form a provided. Œe most important relational operations are sup- key of R. We write r U,k to denote the kth tuple of relation r ported, but even basic optimizations (e.g., join ordering) are sorted by the values of aŠributes U (in ascending order): missing. = U,k Œe R package RIOT-DB [31] uses MySQL as a backend ri r ⇐⇒ ri ∈ r ∧ (1) and translates linear computations to SQL. RIOT-DB addresses |{rj | rj ∈ r ∧ rj .U < ri .U}| = k − 1 the main memory limitations of R, and the optimization of Œe column cast ▽U creates an ordered set L from the SQL statements yields inter-operation optimization. How- sorted values of an aŠribute U that forms a key in relation ever, it is difficult (or sometimes impossible) to express lin- r: ear algebra operations in SQL, and only a few simple opera- L = ▽U ⇐⇒ |L| = |r | ∧ tions, such as subtraction and multiplication, are discussed. 
, (2) AIDA [10] integrates MonetDB and NumPy [3] and ex- ∀1 ≤ i ≤ |r |(L[i] = rU i .U ) ploits the fact that both systems use C arrays as an internal Œe column cast is used to generate a schema from a set of data structure: To avoid copying NumPy data to MonetDB, values. We use this for operations tra, usv, and opd (see AIDA passes pointers to arrays. Data copying is still needed Table 2). Œe column cast is applicable if the cardinality of to pass MonetDB results to NumPy since MonetDB does not a list of aŠributes U is one. guarantee that multiple columns are contiguous in memory, which is required by NumPy. AIDA offers a Python-like pro- Example 3.1. Consider relation r in Figure 1. Œe third tu- (V ),3 = cedural language for relational and linear operations. Se- ple of relation r sorted by the values of aŠribute V is r ▽ = quences of relational operations are evaluated lazily, which (A, 30, 1),the columncast of O is O (A, B,C),andtheval- = allows AIDA to combine and optimize sequences of rela- ues of aŠribute W are r.W {1, 5, 1}. tional operations. Œe optimization does not include linear We use set notation and apply it to bags. Bags can be or- algebra operations. dered or unordered. To emphasize the difference, parenthe- SystemML [13] offers a set of linear algebra primitives ses are used for ordered bags (or lists), e.g., (3,2,3), and curly that are expressed in a high-level, declarative language and braces for unordered bags, e.g., {3,2,3}. When transition- are implemented on MapReduce. SystemML includes linear ing from unordered to ordered bags, the order is specified algebra optimizations that are similar to relational optimiza- explicitly. tions (e.g., selecting the order of execution of matrix multi- plications). Œe system considers only linear algebra opera- 3.2 Matrices tions. An n ×k matrix m is a two-dimensional array with n rows and k columns. |m| is the number of rows, #m the number of columns. Œe element in the ith row and the jth column Table 1: Shape types of matrix operations of matrix m is m[i, j]; the ith row is m[i, ∗]; the jth column is m[∗, j]. Cardinalities Shape type Operations We consider the operations from the R Matrix Algebra |i1 × j1 | → |i1 × i1 | (r1,r1) USV [27]: element-wise multiplication (EMU), matrix multiplica- |i1 × j1 |, |i2 × j1 | → |i1 × i2 | (r1,r2) OPD tion (MMU), outer product (OPD), cross product (CPD), matrix |i1 × i1 | → |i1 × i1 | (r1,c1) INV, EVC, CHF addition (ADD), matrix subtraction (SUB), transpose (TRA), solve |i1 × j1 | → |i1 × j1 | (r1,c1) QQR |i × j |, |j × j | → |i × j | (r ,c ) MMU equation (SOL), inversion (INV), eigenvectors (EVC), eigen- 1 1 1 2 1 2 1 2 |i × i | → |i × 1| (r ,1) EVL values (EVL), QR decomposition (QQR, RQR), SVD – single 1 1 1 1 |i1 × j1 | → |i1 × 1| (r1,1) VSV value decomposition (DSV, USV, VSV), determinant (DET), rank |i1 × j1 | → |j1 × i1 | (c1,r1) TRA RNK CHF ( ), and Choleski factorization ( ). Note that QR and |i1 × j1 | → |j1 × j1 | (c1,c1) RQR , DSV SVD return more than one matrix, therefore we split the |i1 × j1 |, |i1 × j2 | → |j1 × j2 | (c1,c2) CPD operations: QQR and RQR return matrix Q and matrix R of |i1 × j1 |, |i1 × 1| → |j1 × 1| (c1,c2) SOL the QR decomposition, respectively; DSV, USV, and VSV re- |i1 × j1 |, |i1 × j1 | → |i1 × j1 | (r∗,c∗) EMU, ADD, SUB turn vector D with the singular values, matrix U with the |i1 × i1 | → |1 × 1| (1,1) DET le‰ singular vectors, and matrix V with the right singular |i1 × j1 | → |1 × 1| (1,1) RNK vectors of SVD, respectively. 
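As a quick illustration of the Section 3.1 notation, the following Python sketch (ours) spells out the k-th tuple sorted by key attributes U and the column cast on relation r of Figure 1; the helper names kth_tuple and column_cast and the list-of-dicts layout are not from the paper.

# Relation r from Figure 1 with schema (O, V, W).
r = [
    {"O": "A", "V": 30, "W": 1},
    {"O": "C", "V": 22, "W": 5},
    {"O": "B", "V": 10, "W": 1},
]

def kth_tuple(rel, U, k):
    # r^{U,k}: the k-th tuple (1-based) of rel sorted by the key attributes U.
    ordered = sorted(rel, key=lambda t: tuple(t[a] for a in U))
    return ordered[k - 1]

def column_cast(rel, attr):
    # Column cast: the sorted values of a key attribute, reused as a schema.
    return [t[attr] for t in sorted(rel, key=lambda t: t[attr])]

assert kth_tuple(r, ["V"], 3) == {"O": "A", "V": 30, "W": 1}   # Example 3.1
assert column_cast(r, "O") == ["A", "B", "C"]                  # the column cast of O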
Œe matrix concatenation of matrices m and n with k rows (r ,c ), which states that the number of result rows is equal th ∗ ∗ each returns a matrix h with k rows. Œe i row of h is the to the number of rows of the first matrix and the number of th th concatenation of the i row of m and the i row of n. rows of the second matrix. h = m  n ⇐⇒ |h| = |m| ∧ (3) All operations of the matrix algebra are shape restricted. ∀1 ≤ i ≤ |h|(h[i, ∗] = m[i, ∗] ◦ n[i, ∗]) Œis follows directly from the definitions of the matrix op- Œe schema cast ∆U of aŠributes U creates a matrix m erations [14]. Œe first column of Table 1 lists the relevant (with a single column) from the aŠribute names of U: cardinalities from these definitions. We use shape restric- m = ∆U ⇐⇒ #m = 1 ∧|m| = |U| ∧ tion to determine the inheritance of contextual information. (4) ∀1 ≤ i ≤ |U|(m[i, 1] = U[i]) It has also been used in size propagation techniques [7] for the purpose of cost-based optimization of chains of matrix U = Example 3.2. Consider aŠributes (D, B). Matrix d in operations. Figure 1 is the result of the schema cast d = ∆U. Œe result of concatenating matrix d and matrix e is d  e. Note that 4 RELATIONAL MATRIX ALGEBRA the row and column numbers (cells shaded in gray) in the To seamlessly integrate matrix operations into the relational matrix illustrations are not part of the matrix. model, we extend the relational algebra to the relational ma- Matrix operations are shape restricted, i.e., the number of trix algebra (RMA). For each of the matrix operations we result rows is equal to the number of rows of one of the in- define a corresponding relational matrix operation in RMA: put matrices (r), the number of columns of one of the input emu, mmu, opd, cpd, add, sub, tra, sol, inv, evc, evl, qqr, matrices (c), or one (1). Œe same holds for the number of rqr, dsv, usv, vsv, det, rnk, chf. We use upper case for result columns. matrix operations (e.g., TRA) and lower case for RMA oper- Œe dimensionality of result matrices defines the shape ations (e.g., tra). RMA includes both relational algebra and type of matrix operations. We write r1 if the result dimen- relational matrix operations. Œe new operations behave sionality is equal to the number of rows in the first matrix, like regular operations with relations as input and output. r2 if the result dimensionality is equal to the number of rows For each argument relation, r, of a relational matrix oper- in the second matrix, and r∗ if the result dimensionality is ation one parameter must be specified: Œe order schema equal to the number of rows in the first and second matrix U ⊆ R imposes an order on the tuples for the purpose (i.e., r1 = r2). Œe same notation holds for the number of of the operation. Œe aŠributes of the order schema must columns. Table 1 summarizes the shape types of matrix op- form a key2. Œe aŠributes of relation r that are not part erations. of the order schema U, i.e., U = R − U form the applica- tion schema. Œe application schema identifies the aŠributes Example 3.3. Matrix multiplication has shape type (r1,c2), which states that the number of result rows is equal to the with the data to which the matrix operation is applied. number of rows of the first argument matrix, and the num- 2AŠributes that neither belong to the order schema nor the application ber of columns is equal to the number of columns of the schema must be dropped explicitly with a projection (or added to the order second argument matrix. 
Matrix addition has shape type schema, thus, forming a super key). Œe order schema U ⊆ R splits relation r into four non- Example 4.5. In Figure 3, a relation constructor is applied overlapping areas: Order schema U; order part r.U; applica- to schema R and the concatenated matrices m  h to con- tion schema U; and application part r.U. Œe parts of r that struct the result relation: v = γ (m  h, R). do not include matrix values, i.e., the order and application Matrix and relation constructors map between relations U U U schemas ( and ) and the order part (r. ), form the con- and matrices. We use constructors and matrices to define textual information for application part r.U. Intuitively, the relational matrix operations and to analyze their properties. order schema and application schema provide context for At the implementation level, constructors are very efficient columns while the order part provides context for rows. since they split and combine lists of aŠribute names and do not access the data (cf. Section 7). r order T H W application 4.2 Relational Matrix Operations schema 5am 1 3 schema Relational matrix operations offer the functionality of ma- 8am 8 5 order application trix operations in a relational context. Œe general form of a 7am 6 7 U part part unary relational matrix operation is opU(r), where is the 6am 1 4 , order schema. A binary operation opU;V(r s) has an addi- tional order schema V for argument relation s. Figure 2: Structure of a relation instance Œe result of a relational matrix operation is a relation that consists of (a) the base result of the corresponding ma- trix operation, and (b) contextual information, appropriately Example 4.1. Order schema U = (T ) splits relation r in morphed from the contextual information of the argument Figure 2 into four parts: Order schema U = (T ), application relations to reflect the semantics of the operation. schema U = (H,W ), order part r.U = r.(T ) = {5am, 8am, 7am, 6am}, and application part r.U = r.(H,W ) = {(1, 3), Definition 4.6. (Base result) Consider a unary relational (8, 5), (6, 7), (1, 4)}. matrix operation opU(r). Œe matrix that is the result of ma- trix operation OP(µU(r)) is the base result of opU(r). Œe base 4.1 Matrix and Relation Constructors result for binary operations is defined analogously. inv Figure 3 summarizes our approach for the operation and Example 4.7. Consider invT (σT >6am(r)) in Fig. 3. Œe base ′ = example relation r σT >6am(r): (1) Two matrix construc- result is matrix h, which results from INV(µT (σT >6am(r))). tors define matrices m and n that correspond to order and Table 2 defines the details of how contextual information application part of r, respectively; (2) INV inverses matrix n is maintained in relational matrix operations. All definitions resulting in matrix h; and (3) the relation constructor com- follow the structure illustrated in Figure 3. A result rela- bines m  h and R into result relation v. tion is composed from order parts, base result, and schemas Definition 4.2. (Matrix constructor) Let r be a relation, U with the help of a relation constructor. For example, inv is be an order schema. Œe matrix constructor µU(r) returns a defined according to its shape tape in Table 2: invU(r) = U U matrix that includes the values of r. sorted by : γ (µU(r)  INV(µU(r)), U◦U), where µU(r) are the rows of the U U m = µU(r) ⇐⇒|m| = |r | ∧ order part, INV(µU(r)) is the base result, and ◦ is the result schema. 
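The split of a relation into order schema, order part, application schema, and application part (Example 4.1) can be sketched in a few lines of Python; the list-of-dicts representation and the function name split_relation are our own simplification.

# Relation r from Figure 2 with order schema U = (T).
r = [
    {"T": "5am", "H": 1, "W": 3},
    {"T": "8am", "H": 8, "W": 5},
    {"T": "7am", "H": 6, "W": 7},
    {"T": "6am", "H": 1, "W": 4},
]

def split_relation(rel, order_schema):
    schema = list(rel[0].keys())
    app_schema = [a for a in schema if a not in order_schema]     # R - U
    order_part = [tuple(t[a] for a in order_schema) for t in rel]
    app_part = [tuple(t[a] for a in app_schema) for t in rel]
    return order_schema, app_schema, order_part, app_part

U, U_app, order_part, app_part = split_relation(r, ["T"])
# U = ['T'], application schema = ['H', 'W'],
# order part = [('5am',), ('8am',), ('7am',), ('6am',)],
# application part = [(1, 3), (8, 5), (6, 7), (1, 4)]   (as in Example 4.1)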
∀ ≤ ≤ | |( [ , ∗] = U,i .U) 1 i r m i r Operations that have a different number of rows than any of the input relations add a new aŠribute C to the re- We use the complement notation µU(r) to denote the ma- sult relation. Œis aŠribute C is for contextual information trix that includes the values of r.U sorted by U. (cf. Example 4.8): Its values are either the aŠribute names Example 4.3. Consider Figure 3 with relation instance r ′ = of the application schema of an input relation or the op- σT >6am(r) and schema R = (T, H,W ). Œe matrix construc- eration name. Œe operations add, sub, emu require union ′ tor µT (r ) returns matrix n. compatible application schemas and non-overlapping order schemas. Operations usv, opd, and tra construct the appli- Definition 4.4. (Relation constructor) Letm be a matrix with cation schema of the result from the order schema of an in- unique rows, and R be a relation schema with #m aŠributes. put relation. Œerefore the cardinality of the order schemas Œe relation constructorγ (m, R) returns relation r with schema U of tra and usv, and V of opd must be one. R: Example 4.8. Consider Figures 2 and 4. Figure 4a illus- r = γ (m, R) ⇐⇒ |m| = |r | ∧ trates the result of qqr (r). Œe values of T define the or- ∀ ∃ = T t(t ∈ r ⇐⇒ 1 ≤ i ≤ |m|(t m[i, ∗])) dering of tuples for this operation. Œe values of H and W R T H W ′ = = ′ r σT >6am(r ) m µT (r ) v = γ (g, T ◦ T ) T H W 1 T H W 8am 8 5 µT 1 7am γ 7am -0.19 0.27 7am 6 7 2 8am g = m  h 8am 0.31 -0.23 = ( ′) = INV( ) n µT r h n  1 2 3 1 2 1 2 1 7am -0.19 0.27 µT 1 6 7 INV 1 -0.19 0.27 2 8am 0.31 -0.23 2 8 5 2 0.31 -0.23

Figure 3: Structure of our solution for the inversion example, v = invT(σT>6am(r))
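A compact Python rendering (ours) of the Figure 3 pipeline may help: mu plays the role of the matrix constructor, numpy.linalg.inv stands in for INV, and gamma is the relation constructor; applied to σT>6am(r) it reproduces the result relation v of the figure. The list-of-dicts layout is again our own simplification.

import numpy as np

def mu(rel, attrs, order_schema):
    # Matrix constructor: values of attrs, one row per tuple, sorted by order_schema.
    ordered = sorted(rel, key=lambda t: tuple(t[a] for a in order_schema))
    return [[t[a] for a in attrs] for t in ordered]

def gamma(rows, schema):
    # Relation constructor: one tuple per row, attribute names taken from schema.
    return [dict(zip(schema, row)) for row in rows]

def inv_U(rel, U):
    R = list(rel[0].keys())
    U_app = [a for a in R if a not in U]                   # application schema
    m = mu(rel, U, U)                                      # order part, sorted by U
    h = np.linalg.inv(np.array(mu(rel, U_app, U)))         # base result INV(...)
    concat = [m[i] + list(h[i]) for i in range(len(m))]    # matrix concatenation
    return gamma(concat, U + U_app)                        # result schema U followed by U_app

r_sel = [{"T": "8am", "H": 8, "W": 5}, {"T": "7am", "H": 6, "W": 7}]
print(inv_U(r_sel, ["T"]))
# [{'T': '7am', 'H': -0.19.., 'W': 0.27..}, {'T': '8am', 'H': 0.31.., 'W': -0.23..}]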

Table 2: Splitting and morphing relations and matrices

Shape type Operations Definition

(r1,r1) usv opU(r) = γ (µU(r)  OP(µU(r)), U ◦ ▽U) , =  , , U ▽V (r1,r2) opd opU;V(r s) γ(µU(r) OP(µU(r) µV(s)) ◦ ) (r1,c1) inv, evc, chf, qqr opU(r) = γ (µU(r)  OP(µU(r)), U ◦ U) , =  , , U V (r1,c2) mmu opU;V(r s) γ(µU(r) OP(µU(r) µV(s)) ◦ ) (r1,1) evl, vsv opU(r) = γ (µU(r)  OP(µU(r)), U ◦(op)) (c1,r1) tra opU(r) = γ (∆U  OP(µU(r)), (C) ◦ ▽U) (c1,c1) rqr, dsv opU(r) = γ (∆U  OP(µU(r)), (C) ◦ U) , , = ∆U  , , V (c1,c2) cpd sol opU;V(r s) γ ( OP(µU(r) µV(s)) (C) ◦ ) , , , =   , , U V U (r∗,c∗) emu add sub opU;V(r s) γ(µU(r) µV(s) OP(µU(r) µV(s)) ◦ ◦ ) (1,1) det, rnk opU(r) = γ (r ◦ OP(µU(r)), (C, op)) are the values of matrix Q computed as part of the QR de- Œis section gives an application example with a mixed work- composition. Figure 4b illustrates the result of traT (r). Œe load that combines relational and linear algebra operations. column cast ▽T of ordering aŠribute T provides names for It maintains all data in regular relations and illustrates the the aŠributes in the transposed relation. Œe result relation importance of maintaining contextual information. has a new aŠribute C whose values are the names of the Consider relations u, f , and r in Figure 5. Relation u aŠributes in the application schema of r. Note that all re- records name, state and year of birth of users; relation f sult relations come with sufficient contextual information records title, release year and director of films; relation r for each value. For example, relation r in Figure 2 records records user ratings for films. Tuple u1 states that user Ann that Humidity (H) was 1 at 6am, which is also recorded in lives in California and was born in 1980; tuple f1 states that the transposed relation in Figure 4b. film Heat was directed by Lee and was released in 1995; tu- ple r1 states that Ann’s ratings for Balto, Heat, and Net are, qqrT (r ) respectively, 2.0, 1.5, and 0.5. T H W traT (r ) 5am 0.1 0.5 C 5am 6am 7am 8am u (user) f (film) r (rating) 6am 0.8 -0.4 H 1 1 6 8 User State YoB Title RelY Director User Balto Heat Net 7am 0.6 0.4 W 3 4 7 5 u1 Ann CA 1980 f1 Heat 1995 Lee r1 Ann 2.0 1.5 0.5 8am 0.1 0.7 (b) Transpose u2 Tom FL 1965 f2 Balto 1995 Lee r2 Tom 0.0 0.0 1.5 (a) QR decomposition u3 Jan CA 1970 f3 Net 1995 Smith r3 Jan 1.0 4.0 1.0

Figure 4: Examples of relational matrix operations

Figure 5: Example database
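The Table 2 definition of tra can be mirrored in Python to show how contextual information is morphed: the application schema becomes the values of the new attribute C and the column cast of the order attribute provides the result attribute names, as in Figure 4b. The helper name tra and the list-of-dicts layout are our own; the data is the weather relation of Figure 2.

# The weather relation of Figure 2.
r = [
    {"T": "5am", "H": 1, "W": 3},
    {"T": "8am", "H": 8, "W": 5},
    {"T": "7am", "H": 6, "W": 7},
    {"T": "6am", "H": 1, "W": 4},
]

def tra(rel, U):
    R = list(rel[0].keys())
    app_schema = [a for a in R if a not in U]
    ordered = sorted(rel, key=lambda t: tuple(t[a] for a in U))
    new_names = [str(t[U[0]]) for t in ordered]        # column cast of U
    result = []
    for a in app_schema:                               # one result tuple per
        row = {"C": a}                                 # application attribute
        for name, t in zip(new_names, ordered):
            row[name] = t[a]
        result.append(row)
    return result

print(tra(r, ["T"]))
# [{'C': 'H', '5am': 1, '6am': 1, '7am': 6, '8am': 8},
#  {'C': 'W', '5am': 3, '6am': 4, '7am': 7, '8am': 5}]   (Figure 4b)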

Œe task is to determine how similar each of Lee’s films is 5 RMAINACTION to any other film, based on the ratings from California users. Œe covariance [18] is used to compute this similarity. In ad- w4 w3 w8 C Ann Jan dition, we need relational algebra operations (e.g., selection U B H N T B H N Z B -1.25 1.25 σ, aggregation ϑ, rename ρ, and join ) to retrieve selected Ann -1.25 0.5 0.25 z B 1.56 -0.62 -2.5 H -0.5 0.5 1 ratings and films, aggregate ratings, and combine informa- Jan 1.25 -0.5 0.25 z2 H -0.62 0.25 1 tion from different tables. Œe key observation is that a mix- N -0.25 0.25 ture of matrix and relational operations is required to deter- mine the similarities of the ratings. Figure 7: Steps during the computation Œe solution in Figure 6 includes three key steps: (w1), covariance computation (w2-w7), and re- result of a relational matrix operation can be reduced to the trieving Lee’s films together with all similarities (w8). Note result of the corresponding matrix operation. Origins guar- the seamless integration of linear and relational algebra3: antee that each result relation includes sufficient inherited Œe entire process frequently switches between linear and contextual information to relate argument and result rela- relational operations. tion. We prove that each relational matrix operations is ma- trix consistent and returns a relation with origins.
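For intuition, the covariance part of this workload (steps w2-w7 of Figure 6) boils down to centering the California ratings and multiplying the transposed matrix with itself. The numpy sketch below (ours) uses the rating values from Figure 5 and keeps the film names as column context; the relational steps w1 (join and selection of CA users) and w8 (join with film) are omitted.

import numpy as np

films = ["Balto", "Heat", "Net"]
ca_ratings = {                       # ratings of the California users (result of w1)
    "Ann": [2.0, 1.5, 0.5],
    "Jan": [1.0, 4.0, 1.0],
}

X = np.array(list(ca_ratings.values()))          # 2 users x 3 films
centered = X - X.mean(axis=0)                    # subtract the expectations (w2, w3)
cov = centered.T @ centered / (X.shape[0] - 1)   # transpose, multiply, scale (w4-w7)

for i, film in enumerate(films):                 # film names provide the context
    print(film, dict(zip(films, cov[i])))
# Balto {'Balto': 0.5, 'Heat': -1.25, 'Net': -0.25} ...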

= ( = ( Z )) w1 πU ,B,H, N σS ’CA’ u r 6.1 Matrix Consistency w2 = ϑ (w1) AVG(B),AV G(H ),AVG(N ) Matrix consistency ensures that the result relation includes = w3 πU ,B,H, N (subU ;V (w1, ρV (πU (w1))× w2)) all cell values that are present in the base result and the or- = w4 traU (w3) der of rows in the base result can be derived from contextual w5 = mmu (w4,w3) information in the result relations. First, we define reducibil- C;U ity to transition from relations to matrices. w6 = w5 × ρM (ϑCOUNT (∗)(w1)) Definition 6.1. (Reducible) Let r be a relation, U be an or- w7 = π , , , (w6) C B/(M−1) H /(M−1) N /(M−1) der schema. Relation r is reducible to matrix m iff m can = Z w8 πT,B,H, N (σD=’Lee’(w7 C=T f )) be constructed from the aŠribute values of U in relation r sorted by U: = Figure 6: Computing the similarity of the ratings r →U m ⇐⇒ µU(r) m ′ Example 6.2. Consider Fig. 3 with relation r = σ > (r), In the following we discuss the algebra expressions in T 6am matrix n, and order schema T . From Example 4.3 we have Figure 6. First, we join u and r to select ratings from Cal- µ (r ′) = n. Relation r ′ is reducible to matrix n since n can ifornia users (w1). Next, we compute the covariance using T 1 be constructed from the values of H and W in the argument its standard definition [18]: cov(X,Y ) = [(X − E[X ]) ∗ ′ n−1 relation sorted by T , i.e., r →T n. (Y − E[Y ])T]. Œe expectation of an aŠribute, e.g., E(H) = ϑAVG(H )(...), is computed via aggregation (w2). Relational Definition 6.3. (Matrix consistency) Consider a unary ma- matrix operations, sub, tra and mmu, are used to subtract (X − trix operation OP(m). Œe corresponding relational matrix E[X ]), transpose (T), and multiply (∗) relations (w3,w4,w5). operation op is matrix consistent iff for all relations r that Next, we compute the unbiased covarinace (w6,w7). Finally, are reducible to matrix m, the result relation opU(r) is re- we join w7 and f to select Lee’s films. ducible to OP(m): ∀ U = ∃U′ Figure 7 illustrates relations w3, w4, and w8. Consider r,m, (r →U m ⇒ (opU(r) →U′ OP(m))) transpose traU (w3) with order schema U and application A binary relational matrix operation is matrix consistent if schema U = (B, H, N ). Œe result of this operation is a re- its result is reducible to the result of the corresponding bi- lation w4 with schema (C,Ann,Jan). Œe values of aŠribute nary matrix operation. C are the aŠribute names in the application schema of w3. Example 6.4. Consider Figures 2 and 8 with relation r, ma- Note that each operation preserves schema and ordering trix g, matrix RQR(g) and relation rqr (r). information as the crucial parts of contextual information. T • → Œis makes it possible to interpret the tuples in result rela- r T g: Relation r is reducible to matrix g • rqr (r) → RQR(g): relation rqr (r) is reducible to tion w8. For example, tuple z states that Lee’s film Balto T C T 1 RQR( ) has the smallest covariance to film Net. matrix g 6.2 Origins of Result Relations 6 PROPERTIES OF RMA Œe result of a relational matrix operation is a relation that, Œis section defines two crucial requirements for relational in addition to the base result, includes a row origin and a matrix operations. Matrix consistency guarantees that the column origin. Origins (1) uniquely define the relative posi- 3We use the first character of the aŠribute name to refer to aŠributes. 
tioning of result values, (2) give a meaning to values with g Column origins (co) are marked by rectangles (all values in- 1 2 RQR(g) rqrT (r ) side a rectangle form together the column origin for the re- 1 1 3 1 2 C H W lation). Row origins (ro) are marked by ellipses. 2 1 4 1 -10.1 -8.8 H -10.1 -8.8 3 6 7 2 0.0 -4.6 W 0.0 -4.6 p2 = usvT (r ) 4 8 5 p1 = rnkH (πH,W (r )) C rnk T 5am 6am 7am 8am Figure 8: Example of matrix consistency r 1 5am -0.2 0.5 -0.8 0.4 ro = r = (r) 6am -0.3 0.6 0.6 0.4 co = op = (rnk) 7am -0.7 0.2 0.0 -0.7 respect to the applied operation, and (3) establish a connec- 8am -0.7 -0.6 0.0 0.4 tion between argument relations of an operation and its re- ro = r .U = (5am, 6am, 7am, 8am) sult relation. r co = ▽U = (5am, 6am, 7am, 8am) T H W = Example 6.5. Consider inversion and result relation v in p3 qqrW ,T (r ) 5am 1 3 W T H Figure 3. Values 7am and 8am show that (1) value -0.19 pre- 8am 8 5 3 5am 0.1 cedes value 0.31 because 7am precedes 8am; (2) -0.19 is the 7am 6 7 4 6am 0.1 inversion value for humidity and for time 7am; (3) value - 6am 1 4 0.19 in relation v is connected to value 6 in the argument 5 8am 0.7 relation since they have the same origins (7am and H). 7 7am 0.5 ro = r .U = ((3,5am), (4,6am), (5,8am), (7,7am)) Origins are either inherited order schemas, application co = U = (H) schemas from argument relations, or constants. Œe shape Figure 9: Examples of origins type of an operation determines the cardinality of inherited contextual information. Œe indices in the shape type spec- For p2 = usv (r), we have U = (T ), U = (H,W ), U′ = (T ), ify the input relation, from which an origin is inherited. For T U′ example, if the first element of the shape type is c , the row and = (5am, 6am, 7am, 8am). Œe shape type of usv is (r1, 1 . = . ▽ origin is the schema cast of the application schema of the r1) (see Table 2) this makes p2 T r T a row origin and T = (5am, 6am, 7am, 8am) a column origin. first argument relation. Note that indices ∗ and 2 are only possible for binary operations. 6.3 Correctness = op Definition 6.6. (Origins) Consider a unary, v U(r), or Theorem 6.8. All relational matrix operations return ma- = op ( , ) binary, v U;V r s , matrix consistent operation with trix consistent relations with a row and column origin. shape type (x,y), base result m, and aŠribute list U′ such ′ that v →U′ m. Consider Table 3. v.U is a row origin iff Proof. First, we prove that relational matrix operations v.U′ is equal to ro for the given shape type x. U′ is a column are matrix consistent. Second, we show that inherited con- origin iff U′ is equal to co for the given shape type y. textual information in the result corresponds to Definition 6.6. (1) Consider a unary operation with shape type (x,y). We Table 3: Definition of origins for shape type (x,y) start with the definition of matrix consistency (Definition 6.3), ′ ′ U equal to U (x =r1) or C (x=c1), and U equal to U (y=c1) x ro y co or ▽U (y=r1). We instantiate the implication with the defini-

r1 r .U r1 ▽U tion of the operation in Table 2. Œen, we simplify the right r2 s .V r2 ▽V hand side of the implication, substituting it with the equality c1 ∆U c1 U in Definition 6.1. Next, we expand the equality with the defi- c2 ∆V c2 V nitions of relational and matrix constructors (Definitions 4.2 r∗ (r .U, s .V) r∗ ▽U and 4.4). A series of simplifications yields the equality of the c∗ (∆U, ∆V) c∗ U right and le‰ hand side. ′ ′ ′ ′ 1 r 1 op (2) Consider operation v = qqrU(r) with shape type (r1, c1). Matrix consistency of qqr was shown in the first part of U = U′ Example 6.7. Figure 9 illustrates relation r and the origins the proof. Œe shape type of qqr is (r1,c1) and we get , U = U′ U U for operations and . v. is the row origin because x=r1 and is the column origin because y=c (see Table 3). • rnk (π (r)) with shape type (1,1) 1 H H,W Œe same reasoning applies to the other types of opera- usv • T (r) with shape type (r1,r1) tions in Table 2. Œus, each relational matrix operation re-  • qqrW ,T (r) with shape type (r1,c1) turns a result relation with a row and a column origin. Œe following example uses a sequence of tra operations 7.1 MonetDB to illustrate the importance of origins. A result relation with MonetDB stores each column of a table as a binary associa- origins inherits sufficient contextual information, such that tion table (BAT). A BAT is a table with two columns: Head each value can be interpreted. Origins also carry sufficient and tail. Œe head is a column with object identifiers (OID), information about the order of rows, such that in sequences while the tail is a column with aŠribute values. All aŠribute of relational matrix operations no ordering information is values of a tuple in a relation have the same OID value. Œus, lost between operations. a tuple can be constructed by concatenating all tail values Example 6.9. Consider a relation instance r that is reducible with the same OID. MonetDB operations manipulate BATs to matrix n. Figure 10 shows the matrix expression TRA(TRA(n)) and relational operations are represented and executed as and the respective relational matrix expression traC (traT (r)). sequences of BAT operations. Example BAT operations are Operation T tra(r) returns relation r1, which in addition to B1 ∗ B2, B1/B2, and B1 − B2 for element-wise multiplication, the application schema (aŠributes 5am,6am,7am,8am) also division, and subtraction, and sum(B) to sum the values in includes aŠribute C, which is preserved together with the BAT B. application schema. Example 7.1. MonetDB stores relation σT >6am(r) from Fig- r n ure 3 in three BATs as shown in Figure 12. T H W 1 2 5am 1 3 1 1 3 T H W 8am 8 5 r →T n 2 1 4 OID Val OID Val OID Val 7am 6 7 3 6 7 0 8am 0 8 0 5 6am 1 4 4 8 5 1 7am 1 6 1 7 tra T (r) TRA(n) Figure 11: BAT representation of σT >6am(r) r 1 n1 C 5am 6am 7am 8am 1 2 3 4 H 1 1 6 8 r1 →C n1 1 1 1 6 8 One important BAT operation is le‡fetchjoin (↓), which W 3 4 7 5 2 3 4 7 5 returns a BAT with OIDs sorted according to the order of tra (r1) TRA(n1) OIDs of another BAT from the same relation. For instance, C X ↓ Y returns BAT X , whose OIDs have the same order as r 2 n2 OIDs of BAT Y . X ↓X denotes X sorted by its own values. C H W 1 2 5am 1 3 1 1 3 7.2 RMA Integration 6am 1 4 r2 → n2 2 1 4 C As a first step, we have extended the SQL parser to make 7am 6 7 3 6 7 8am 8 5 4 8 5 the relational matrix operations available in the from clause of SQL [9]. Œe syntax (r BY U) the specifies ordering for an argument relation r. 
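Read as a lookup table, Table 3 can be paraphrased as follows; the sketch (Python, ours) uses informal textual labels as shorthands for the formal definitions above.

# Informal paraphrase of Table 3: which part of the input supplies the
# row and column origin for a given shape type.
ROW_ORIGIN = {
    "r1": "order part r.U",          "r2": "order part s.V",
    "c1": "schema cast of r's application schema",
    "c2": "schema cast of s's application schema",
    "r*": "(r.U, s.V)",              "c*": "both schema casts",
    "1":  "the argument relation r",
}
COL_ORIGIN = {
    "r1": "column cast of U",        "r2": "column cast of V",
    "c1": "r's application schema",  "c2": "s's application schema",
    "r*": "column cast of U",        "c*": "r's application schema",
    "1":  "the operation name",
}

def origins(shape_type):
    x, y = shape_type
    return ROW_ORIGIN[x], COL_ORIGIN[y]

print(origins(("r1", "c1")))   # inv, evc, chf, qqr: row origin from r.U, column origin from the application schema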
Figure 10: Origins and matrix consistency

As an example, consider relations r and s and ordering attributes U and V. The unary operation

invU and the binary operation mmuU;V are expressed as: 7 IMPLEMENTATION SELECT * FROM INV(r BY U); We discuss the integration of our solution into MonetDB. SELECT * FROM MMU(r BY U, s BY V); Œe implementation of relational matrix operations includes Œese basic constructs can be composed to more complex the processing of contextual information and the computa- expressions. For instance, folding w5, w6 and w7 from Fig- tion of the base result. Contextual information is handled ure 6 yields the RMA expression: inside MonetDB, while the computation of the base result can be done in MonetDB or delegated to external libraries πC,B/(M−1),H /(M−1), N /(M−1)( (e.g., MKL). Œe integration of each relational matrix oper- mmuC;U (w4,w3)× ρM (ϑCOUNT (∗)(w1))) ation requires extensions throughout the system, but does not change the query processing pipeline and no new data Œe SQL translation of this expression is: structures are introduced. To extend MonetDB with addi- SELECT C, B/(M-1), H/(M-1), N/(M-1) tion, QR decomposition, linear regression, and the transfor- FROM MMU(w4 BY C, w3 BY U) AS w5 mation of numerical data to the MKL format we touch 20 CROSS JOIN (out of 4500) files and add 2500 lines of code. ( SELECT COUNT (*) AS M FROM w1 ) AS t; 1 processes a node that represents a unary re- calls based on the complexity of the operation, the amount lational matrix operation opU(r) and translates it to a list of of data to be copied, and the relative performance of the ma- BAT expressions. In lines 2 - 7, the BATs of relation r are trix operation in MonetDB compared to the external library. split, sorted, and morphed to get BATs X with row origins Œe no-copy implementation of matrix operations in the and BATs Y with the application part. Spliˆing (lines 2 and kernel of MonetDB is performed over BATs directly. Essen- 4) divides a relation into two parts: Œe application part, on tially, standard must be reduced to BAT opera- which the matrix operations are performed, and the contex- tions. Œe process of reducing is highly dependent on the tual information, which gives a meaning to the application operation. Œe goal is to design algorithms that access en- part. BATs B are split into application part and order part ac- tire columns and minimize accesses to single elements of cording to U. Sorting (lines 3 and 4) determines the order of BATs. To achieve this standard value-based algorithms must the tuples for a specific matrix operation. Œe order schema be transformed to vectorized BAT operations. U is used to sort the BATs: BATs in U are sorted according to their values while the other BATs in B are sorted accord- Algorithm 2: INV(B) ing to the OIDs of the BATs in U. Œe order is established 1 n ← B.length; BR ← IDmatrix(n); for each operation based on the contextual values in the re- 2 for i = 1 to n do lation. Morphing (lines 5-7) morphs contextual information 3 v1 ← sel(Bi, i); Bi ← Bi /v1; BRi ← BRi /v1; so that it can be added to the base result. Finally, the matrix 4 for j = 1 to n do operation is applied to Y (line 8). Merging (line 9) combines 5 if i , j then the result of a matrix operation with relevant contextual in- 6 v2 ← sel(Bj,i); Bj ← Bj − Bi ∗ v2; formation and constructs the result relation with row and 7 BRj ← BRj − BRi ∗ v2; column origins. Merging and spliŠing are efficient opera- tions that work at the schema level and do not access the 8 return BR; data. 
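The following Python sketch (ours) mirrors Algorithm 1 for the unary case, with plain lists standing in for BATs: the relation is split into order and application BATs, sorted, morphed, evaluated, and merged. The shape-type dispatch is reduced to the (r1, c1) case needed for inv, and eval is delegated to numpy rather than MKL or the MonetDB kernel.

import numpy as np

def unary_rma(op, U, rel):
    # op is fixed to inv in this sketch.
    B = {a: list(col) for a, col in rel.items()}               # BATs(r)
    D = {a: B[a] for a in U}                                   # order-schema BATs
    n = len(next(iter(B.values())))
    perm = sorted(range(n), key=lambda i: tuple(D[a][i] for a in U))   # sort(D)
    G = {a: [D[a][i] for i in perm] for a in U}
    Y = {a: [B[a][i] for i in perm] for a in B if a not in U}  # fetch in the order of G
    X = G                                                      # shape type (r1, c1)
    F = np.linalg.inv(np.array(list(Y.values())).T)            # eval(op, Y)
    out = dict(X)                                              # Concat(X, F)
    for j, a in enumerate(Y):
        out[a] = list(F[:, j])
    return out

rel = {"T": ["8am", "7am"], "H": [8, 6], "W": [5, 7]}
print(unary_rma("inv", ["T"], rel))
# {'T': ['7am', '8am'], 'H': [-0.19.., 0.31..], 'W': [0.27.., -0.23..]}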
Algorithm 2 illustrates the reduction for the Gauss Jor- Algorithm 1: UnaryRMA(op, U, r) dan elimination method for the INV computation. Œe al- 1 B ← BATs(r); Y ← {} ; gorithm takes a list of BATs B = (B1, B2,.., Bn ) and returns 2 D ← {b|b ∈ B,b is in order schema U} ; the inversion as a list of BATs BR of the same size. Func- 3 G ← sort(D) ; tion IDmatrix(n) creates a list of BATs that represents the 4 for b ∈ B \ D do Y ← Y ∪ b ↓G; identity matrix of size n×n. Œe selection operation sel(B, i) 5 if ShapeType(op) ∈ {(r,r), (r,c), (r,1)} then X ← G; returns the ith value in B. With the exception of the sel oper- 6 else if ShapeType(op) ∈ {(c,r), (c,c)} then X ← newBAT (Y ); ation, all operations are standard MonetDB BAT operations 7 else X ← newBAT (r); that are also used for relational queries. For example, the 8 F ← eval(op,Y ); operation on Bi ← Bi /v1; divides each element of a BAT 9 return Concat(X, F); with a scalar value. 8 PERFORMANCE EVALUATION = Example 7.2. Figure 12 illustrates Algorithm 1 for v Setup. All runtimes are averages over 3 runs on an Intel(R) inv = , , T (σT >6am(r)). Spliˆing: input list B (T H W ) is split Xeon(R) E5-2603 CPU, 1.7 GHz, 12 cores, no hyper-threading, = = into order list D (T ) and application list B \ D (H,W ). 98GB RAM (L1: 32+32K,L2:256K,L3:15360K),Debian 4.9.189. Sorting: BAT T is sorted, producing G. Œen, (H,W ) are sorted according to G returning (H ↓ T,W ↓ T ). Morphing: Competitors. We empirically compare the implementations 4 Since inv is of shape type (r1,c1), row contextual informa- of our relational matrix algebra (RMA+ ) with the statisti- tion is the order schema: X = T ↓ T . Merging: BATs X are cal package R, the array database SciDB, and two state-of- concatenated with BATs F to form the result. the-art in-database solutions, AIDA [10] and MADlib [15]. (1) We implemented RMA+ in MonetDB (v11.29.8) with two 7.3 Computing the Base Result options for matrix operations: (a) BATs (RMA+BAT): No- Line 8 of Algorithm 1 calls the procedure that computes the copy implementation in the kernel of MonetDB; (b) MKL matrix operation. Œe computation can be done either in (RMA+MKL): Copy BATs to an MKL (v2019.5.281) [2] com- the kernel of MonetDB or by calling an external library (e.g., patible format (contiguous array of doubles), then copy the MKL [2]). Calling an external library requires copying data result back to BATs. We execute linear operations (add, sub, from BATs to the external format and copying the result emu) on BATs and use MKL for more complex operations. back. Œe query optimizer decides about external library 4Œe implementation can be found here: hŠps://github.com/oksdolm/RMA input spliŠing sorting morphing eval merging

Figure 12: Splitting, sorting, morphing, merging for query v = invT(σT>6am(r)). The figure traces the example BATs through the stages input σT>6am(r), splitting (D and B \ D), sorting (G and Y), morphing (X), eval (F), and merging (Concat(X, F)).
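A Python version of Algorithm 2 (ours) makes the column-at-a-time style explicit: every update is a whole-column operation and only sel reads a single element. numpy arrays stand in for BATs; on the two columns of the running example it reproduces the inverse from Figure 3.

import numpy as np

def inv_columns(B):
    # Gauss-Jordan inversion over a list of columns (BAT stand-ins).
    n = len(B)
    B = [np.array(col, dtype=float) for col in B]
    BR = [np.eye(n)[:, i] for i in range(n)]      # IDmatrix(n), one column each
    for i in range(n):
        v1 = B[i][i]                              # sel(B_i, i)
        B[i] = B[i] / v1                          # column / scalar
        BR[i] = BR[i] / v1
        for j in range(n):
            if i != j:
                v2 = B[j][i]                      # sel(B_j, i)
                B[j] = B[j] - B[i] * v2           # whole-column updates
                BR[j] = BR[j] - BR[i] * v2
    return BR

cols = [[6.0, 8.0], [7.0, 5.0]]   # BATs H and W of the matrix [[6, 7], [8, 5]]
print(np.column_stack(inv_columns(cols)))
# [[-0.19..  0.26..]
#  [ 0.30.. -0.23..]]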

When the matrices do not fit into main memory we switch to of order columns. We compute add and qqr on these rela- BATs. Due to the full integration of RMA+, MonetDB takes tions. Since add and qqr are inexpensive for single column care of core usage and work distribution, and all cores are matrices, the main cost is the maintenance of the order part. used for relational and for matrix operations. (2) SciDB [30] To handle contextual information we split, sort, morph, uses an array data model, and queries are expressed in the and merge lists of BATs (cf. Section 7.2). Sorting is the most high-level, declarative languageAQL (Array ‹ery Language)[29].expensive operation. Fortunately, sorting is not always nec- SciDB uses all available cores. (3) AIDA is a state-of-the-art essary. For example, permuting the input rows for the qqr solution for the integration of matrix operations into a re- operation will affect the order of the result rows, but will lational database and was shown to outperform other solu- not change their values. Œerefore, sorting is not required. tions like Spark or the pandas library for Python [10]. AIDA In element-wise operations like add, emu, or sol, only the executes matrix operations in Python and offers a Python- relative order of the rows in the two input relations maŠers. like syntax for relational operations, which are then trans- Œus, only the order part of the second relation requires sort- lated into SQL and executed in MonetDB (v11.29.3). AIDA ing (to get the same order). uses all cores both in MonetDB and in Python. We also in- Figure 13 shows the results. (1) Handling contextual infor- tegrate our solution into MonetDB, which makes AIDA a mation is efficient and scales to large numbers of aŠributes. particularly interesting competitor. (4) MADlib [15] (v1.10) (2) Œe optimized operators that (partially) avoid sorting clearly provides a collection of UDFs for PostgreSQL (v9.6) for in- outperform their non-optimized counterparts. database matrix and statistical calculations. MADlib does 4 4 not use multiple cores, which affects its overall performance. add add qqr add, relative sorting (5) Œe R package (v3.2.3) is highly tuned for matrix oper- 3 add, relative sorting 3 qqr qqr, w/o sorting qqr, w/o sorting ations and is a representative of a non-database solution. 2 2 R performs all relational operations with data.tables struc- 1 Runtime (sec) tures and transforms the relevant columns to matrices to Runtime (sec)1 compute the matrix operations. An alternative approach to 0 200 400 600 800 1,000 20 40 60 80 100 use character matrices for all operations is very inefficient #aŠributes in order scpecification #aŠributes in order schema (cf. Section 8.5). R uses all cores for matrix operations but (a) 100K tuples (b) 1M tuples runs relational operations on a single core. Data. BIXI [17] stores trips and stations of Montreal’s Figure 13: Handling contextual information public bicycle sharing system, years 2014-2017. DBLP [23] stores authors with their publication counts per conference Note that a number of operations (cpd, sol, rqr, dsv, tra, as well as conference rankings. Œe synthetic dataset used det, rnk) do not preserve row context since the number of in the experiment to the effect of sparsity includes rows changes. Instead, a single column with predefined val- values between 0 and 5,000,000. 
All other synthetic datasets ues (operation name or aŠribute names of the application include real-valued numeric aŠributes with uniformly dis- schema) is created, which is negligible in the overall run- tributed values between 0 and 10,000. time. 8.1 Maintaining Contextual Information 8.2 Wide and Sparse Relations A salient feature of our approach is that contextual infor- Wide relations. Current databases scale beŠer in the num- mation is maintained during matrix operations. We analyze ber of tuples than in the number of aŠributes. We test our the scalability of maintaining context and study an optimiza- RMA+ implementation in MonetDB on wide relations. We tion that avoids sorting. To this end, we generate relations generate relations with 1000 tuples, one order aŠribute, and with a single application column and an increasing number a varying number of application aŠributes. In Table 4, we increase the number of aŠributes from 1K to 10K and mea- 8.4 RMA+ vs. Array Databases sure the runtime of the add operation. MonetDB can han- We study the performance of RMA+ vs. SciDB [30] as a rep- dle wide relations with several thousands of aŠributes, even resentative of array databases. We compute add on two ma- though the runtime per column increases with the aŠribute trices with 10 columns and a varying number of rows, fol- number. lowed by a selection 5. Œe resulting runtimes for Ubuntu are shown in Table 7. RMA+ outperforms SciDB by more Table 4: add over wide relations in RMA+ than an order of magnitude. RMA+ performs addition di-

rectly over pairs of relations, while SciDB must compute a so-called array join [29] over the input arrays in order to add their values.

#attr  1K   2K   3K   4K   5K    6K   7K   8K   9K   10K
sec    0.6  2.2  4.8  8.8  13.4  20   27   36   47   62

Sparse relations. We analyze the effect of MonetDB’s built- Table 7: add followed by a selection: RMA+ vs. SciDB in compression on relations with many zeros. We add two #tuples 1M 5M 10M 15M relations (5M tuples, one order, 10 application aŠributes) RMA+ 4.6s 24.4s 1m18s 1m39s with uniformly distributed non-zero values (range 1-5M). In SciDB 1m21s 7m6s 13m2s 18m23s Table 5 we increase the percentage of zero values (position of zeros is random) and measure the runtime: Œe add oper- ation on sparse matrices is up to two times faster than the same operation on dense matrices. Œus, RMA+ leverages 8.5 Overhead of MonetDB’s compression features. We investigate the overhead of data transformation for vari- ous matrix operations in a mixed relational/matrix scenario. Table 5: add over sparse relations in RMA+ RMA+ is free to execute matrix operations directly on BATs or rearrange the numerical data in main memory and % 0 10 20 30 40 50 60 70 80 90 100 delegate the matrix operations to specialized packages like sec 1.68 1.60 1.49 1.41 1.33 1.25 1.16 0.99 0.94 0.89 0.76 MKL [2]. R does not enjoy this flexibility: R uses the matrix data type for matrix operations and the data.tables storage structure for relational operations. While data.tables sup- 8.3 RMA+ vs. Non-Database Approaches ports simple linear operations like linear model construc- We study the scalability of RMA+ to large relations and com- tion, the data must be transformed to the matrix type for pare to R as a non-database solutions for matrix operations. more complex operations like CPD, OPD, or MMU. Matrices In Table 6 we measure the runtime for qqr on tables with up cannot store a mix of numerical and non-numerical values, to 100M tuples and 70 aŠributes in the application schema. which is required when working with tables; R offers charac- For relations up to a size of 50Mx40, RMA+ delegates the ter matrices, but they are very inefficient, e.g., joining trips matrix computation to MKL; the runtime includes copying and stations in the BIXIdataset takes 40sec for the character the data. RMA+ is consistently faster than R since MKL can matrix type and less than 2 sec for data.tables. beŠer leverage the hardware. R fails for sizes above 50Mx40 Figure 14 shows the percentage of time spent for data since it runs out of memory. In RMA+ we switch to the BAT transformations on relations with 50 columns and a vary- implementation, which leverages the memory management ing number of rows (100k to 500k). For R we measure the of MonetDB. Œe Gram-Schmidt qqr baseline [12] that we time of transforming the relation from data.table to matrix implemented over BATs is slower than the MKL algorithm and back as a percentage of the overall query time, which in- (e.g., 834 vs. 61.4 sec for 50Mx40), which explains the in- cludes the actual matrix operation. For RMA+ we measure crease in runtime. RMA+ scales to large relations that do not the time share for copying the data from a list of BATs to a fit into memory (e.g., relation size 100Mx70 requires 56GB). contiguous, one-dimensional array for MKL, and for copy- ing the result back; the overall runtime in addition includes Table 6: Runtimes of qqr in seconds in R and RMA+ the matrix computation in MKL (but excludes the MonetDB query pipeline of query parsing, query tree creation, etc.). 10 attr 40 attr 70 attr Clearly, the overhead of transforming data maŠers for System R RMA+ R RMA+ R RMA+ both R and RMA+. 
We draw the following conclusions: (a) 5M tup 3.5 2.1 20 6.6 47 11.6 Transforming data between data structures is costly. (b) For 50M tup 37 21.3 221 61.4 fail 2018 100M tup 74 40 fail 1690 fail 4064 5We run this experiment on Ubuntu 14.04 since SciDB does not support Debian; Ubuntu runs on a server with 4 cores and 16GB of RAM. #rows (#columns = 50) #rows (#columns = 50) which adversely affects the relational performance. MADlib 500K 81 75 64 21 7 7 500K 92 92 86 53 44 43 is outperformed by all other solutions due to the slow com- 300K 79 77 63 21 7 7 300K 91 91 86 55 45 40 putation of the linear regression. RMA+ outperforms AIDA 100K 84 74 69 23 9 10 100K 86 86 80 48 37 35 on all datasets. Although both RMA+ and AIDA compute ADD EMU MMU QQR DSV VSV ADD EMU MMU QQR DSV VSV the relational operations in MonetDB, RMA+ is up to 6.3 (a) Data.table and matrix (b) List of BATs and 1D array times faster: While AIDA passes pointers to access numer- ical Python data in MonetDB, this does not work for other Figure 14: Data transformation share: (a) R, (b) RMA+ data types (e.g., date, time, string) due to different storage formats [11]. Œerefore, expensive data transformations must be applied. simple operations like ADD and EMU, the transformation over- head dominates the overall runtime (up to 92%). (c) For com- 5 plex operations, the performance of the matrix operation 25 RMA+ AIDA R MADlib RMA+MKL RMA+BAT 4 dominates the overall runtime. 20 15 3


8.6 Efficiency for Mixed Workloads Runtime (sec) Runtime (sec) 5 1 We analyze four workloads that require a mix of relational 0 0 3.1 6.5 10.5 14.5 3.1 6.5 10.5 14.5 operations and matrix operations, and we compare our im- #tuples (M) #tuples (M) plementation of RMA (RMA+) to its competitors (R, AIDA, (a) System Comparison MADlib). Œe workloads stem from applications on our real- (b) RMA+BAT vs RMA+MKL world datasets and differ in the complexity of relational vs. Figure 15: Trips (Ordinary Linear Regression) matrix part. On the BIXI dataset, we compute (1) the lin- ear regression between distance and duration for individual (2) Journeys – Multiple Linear Regression. We compose trips trips, and (2) journeys connecting up to 5 trips; on DBLP we that meet in a station into journeys. We start from 15M one- compute the (3) covariance between conferences based on trip journeys of the form (start station, end station, dura- the publication counts per conference and author; (4) on a tion); all aŠributes are numerical. During data preparation, synthetic dataset based on BIXI we count trips per rider. we perform joins to create journeys of up to five trips, select those that appear at least 50 times, and join stations with (1) Trips – Ordinary Linear Regression. Trips in BIXI in- their coordinates to compute the distances between subse- clude start date and start station, end date and end station, quent stations in a journey. At the matrix level, we do a duration, and a membership flag for the rider; stations have multiple linear regression analysis with the distances as in- a code, a name, and coordinates. At the level of relations, we dependent variables and the overall duration as the depen- need to perform the following data preparation steps: (a) Ag- dent variable. gregate the trips and select those trips that were performed Figure 16a shows the runtime for journey lengths of 1 to at least 50 times; (b) join trips and stations to retrieve the sta- 5 trips (i.e., 1 to 5 independent variables). Œe solid part tion coordinates and compute the distance. We use the OLS is the time for data preparation (relational operations); the method [28] to compute the linear regression between dis- dashed light part is the time for multiple linear regression tance and duration. OLS uses cross product, matrix multipli- (matrix operations). RMA+ and AIDA again outperform R cation, and inversion: MMU(INV(CPD(A, A)), CPD(A,V )), where on the relational part of the query. Œe relational part oper- A is the matrix with the independent variables, and V is the ates on purely numerical data and AIDA shows comparable vector with the dependent variable. join performance to RMA+. MADlib spends about two third Figure 15a shows the runtime results for trips reported of the relational runtime on distance computations and is in the years 2014 (3.1M trips), 2014-2015 (6.1M trips), 2014- therefore slower than its competitors also on the relational 2016 (10.5M trips), and 2014-2017 (14.5M trips), respectively. part. Œe input data consists of numeric and non-numeric types such as date and time. We break the runtime down into (3) Conferences – Covariance Computation. We compute data preparation (solid area of the bar) and matrix compu- the covariance between conferences with A++ rating to lower tation time (dashed light area) for RMA+, R, and AIDA; for rated conferences based on the number of publications per R we also show the load time from a CSV file (dark area). author and conference. 
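The matrix part of the trips workload above is a single OLS step, MMU(INV(CPD(A, A)), CPD(A, V)), i.e., (A^T A)^{-1} A^T V. The numpy sketch below (ours) shows this formula on made-up distance/duration values; adding an intercept column is our choice for the illustration, and the real query of course runs over the prepared BIXI relations.

import numpy as np

dist = np.array([1.2, 0.8, 2.5, 3.1, 1.9])             # independent variable
dur  = np.array([300.0, 210.0, 640.0, 800.0, 500.0])   # dependent variable

A = np.column_stack([np.ones_like(dist), dist])        # intercept + distance
V = dur
beta = np.linalg.inv(A.T @ A) @ (A.T @ V)              # CPD, INV, MMU
print(beta)                                            # [intercept, slope]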
Figure 15a shows the runtime results for trips reported in the years 2014 (3.1M trips), 2014-2015 (6.1M trips), 2014-2016 (10.5M trips), and 2014-2017 (14.5M trips), respectively. The input data consists of numeric and non-numeric types such as date and time. We break the runtime down into data preparation (solid area of the bar) and matrix computation time (dashed light area) for RMA+, R, and AIDA; for R we also show the load time from a CSV file (dark area). RMA+ and AIDA outperform R and MADlib in all scenarios. R performs poorly on the relational operations of the data preparation step: the join implementation of R does not leverage multiple cores, and R lacks a query optimizer, which adversely affects the relational performance. MADlib is outperformed by all other solutions due to the slow computation of the linear regression. RMA+ outperforms AIDA on all datasets. Although both RMA+ and AIDA compute the relational operations in MonetDB, RMA+ is up to 6.3 times faster: while AIDA passes pointers to access numerical Python data in MonetDB, this does not work for other data types (e.g., date, time, string) due to different storage formats [11]; therefore, expensive data transformations must be applied.

Figure 15: Trips (Ordinary Linear Regression). (a) System Comparison, (b) RMA+BAT vs RMA+MKL; runtime (sec) vs. #tuples (M).

(2) Journeys – Multiple Linear Regression. We compose trips that meet in a station into journeys. We start from 15M one-trip journeys of the form (start station, end station, duration); all attributes are numerical. During data preparation, we perform joins to create journeys of up to five trips, select those that appear at least 50 times, and join stations with their coordinates to compute the distances between subsequent stations in a journey. At the matrix level, we do a multiple linear regression analysis with the distances as independent variables and the overall duration as the dependent variable.

Figure 16a shows the runtime for journey lengths of 1 to 5 trips (i.e., 1 to 5 independent variables). The solid part is the time for data preparation (relational operations); the dashed light part is the time for multiple linear regression (matrix operations). RMA+ and AIDA again outperform R on the relational part of the query. The relational part operates on purely numerical data, and AIDA shows join performance comparable to RMA+. MADlib spends about two thirds of the relational runtime on distance computations and is therefore slower than its competitors also on the relational part.

Figure 16: Journeys (Multiple Linear Regression). (a) System Comparison, (b) RMA+BAT vs RMA+MKL; runtime (sec) vs. #trips.

(3) Conferences – Covariance Computation. We compute the covariance between conferences with an A++ rating and lower-rated conferences based on the number of publications per author and conference. The data includes two tables: ranking stores a rating (e.g., A++, A+, B) for each conference; publication stores the number of publications per author and conference; the first attribute is the author, and the other attributes are conference names (i.e., the result of an SQL PIVOT over a count-aggregate by conference and author). The query computes the covariance matrix on publication and joins the result with ranking to select the A++ conferences.
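As an illustration of the shape of this query, the pandas sketch below computes a covariance matrix over hypothetical conference columns, keeps the conference names as contextual information, and joins the result with a hypothetical ranking table to select the A++ conferences. The table and column names are ours, and pandas' cov() merely stands in for the cross-product-based covariance used in the experiments:

import pandas as pd

# Hypothetical miniature versions of the DBLP-derived tables.
publication = pd.DataFrame({
    "author": ["a1", "a2", "a3", "a4"],
    "SIGMOD": [3, 0, 2, 1],
    "VLDB":   [1, 2, 0, 4],
    "ICDE":   [0, 1, 1, 2],
})
ranking = pd.DataFrame({
    "conference": ["SIGMOD", "VLDB", "ICDE"],
    "rating":     ["A++", "A++", "A+"],
})

# Covariance matrix over the numeric conference columns; the conference
# names are the contextual information of the result rows.
cov = publication.drop(columns="author").cov()
cov = cov.reset_index().rename(columns={"index": "conference"})

# Join with ranking and keep the rows of the A++ conferences.
result = cov.merge(ranking, on="conference")
print(result[result["rating"] == "A++"])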

We measure the runtime for publication tables of increasing sizes: (1) 337363x266 (i.e., 337363 authors and 266 conferences), (2) 550085x519, (3) 722891x744, and (4) 876559x882. The ranking table stores 882 tuples. Note that the number of result rows of covariance is identical to the number of input columns, e.g., covariance of publications with 266 columns returns a relation (or matrix) of size 266x266.

Figure 17a shows the runtime results for RMA+, R, and AIDA. MADlib runs for 77, 429, 1086, resp. 1814 seconds on the different relation sizes and, thus, is omitted from the figure. In all systems, the covariance computation dominates the overall runtime with at least 90%. Since AIDA does not support covariance, we implement covariance via the cross product [18] in all algorithms except MADlib, which has a cov() function but does not support the cross product. For the cross product in RMA+ we use the routine cblas_dsyrk() since the result of the multiplication is symmetric; in AIDA we use a.t@a, and in R we use crossprod (footnote 6).

Footnote 6: We do not use the cov() function since it uses a single core only and is slower.

Note that the covariance computations in AIDA and R do not return contextual information. In order to join the result with ranking and to select all A++ conferences, the conference names must be manually added as a new column.

Figure 17: Conferences (Covariance Computation). (a) Systems Comparison, (b) RMA+BAT vs RMA+MKL; runtime (sec) vs. size of the publications table.

(4) Trip Count. In Figure 18 we compute the number of trips per rider to 10 different destinations. Each tuple in the input relations stores a rider and the number of trips to each of the 10 locations for one year. We use add on the relations of two different years to get the trip count for a period of two years. We vary the number of riders from 1M to 15M and measure the runtime. Since add is a simple operation, RMA+ uses the no-copy implementation on BATs (RMA+BAT). RMA+ is faster than AIDA and R because it does not transfer data to Python (as AIDA) and does not translate data.tables to matrices (as R). MADlib takes 23, 119, 299, resp. 480 seconds for the different input sizes and, thus, is again omitted from the figure.

Figure 18: Trip Count (Matrix Addition). (a) Systems Comparison, (b) RMA+BAT vs RMA+MKL; runtime (sec) vs. #tuples (M).
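To illustrate the shape of the trip-count workload, the pandas sketch below adds the per-destination trip counts of two hypothetical yearly relations keyed by rider; in RMA+ the corresponding add runs directly on the BATs without copying (RMA+BAT). All names and values are illustrative only:

import pandas as pd

# Hypothetical yearly trip counts per rider for three of the ten
# destinations (the experiments use 10 destination columns).
year1 = pd.DataFrame(
    {"dest1": [5, 0, 2], "dest2": [1, 3, 0], "dest3": [0, 2, 4]},
    index=pd.Index(["r1", "r2", "r3"], name="rider"),
)
year2 = pd.DataFrame(
    {"dest1": [2, 1, 0], "dest2": [0, 4, 1], "dest3": [3, 0, 2]},
    index=pd.Index(["r1", "r2", "r3"], name="rider"),
)

# Elementwise add of the two relations yields the trip counts for the
# two-year period; the rider ids identify the result rows.
two_years = year1.add(year2)
print(two_years)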

RMA+BAT vs. RMA+MKL. Following our policy, RMA+ delegates matrix operations to MKL (RMA+MKL) in Figures 15a, 16a, and 17a (the operations are complex and we do not run out of memory), and uses the no-copy implementation on BATs (RMA+BAT) in Figure 18a (add is a linear operation). We compare RMA+BAT to RMA+MKL in all scenarios. RMA+MKL outperforms RMA+BAT for the queries on trips (factor 1.8-3.8, cf. Figure 15b) and journeys (factor 1.4-1.9, cf. Figure 16b). For the conference query, RMA+MKL is 24 to 70 times faster since the cross product requires single element access and operates on relations with a large number of attributes. For the trip count, RMA+BAT outperforms RMA+MKL in all settings (cf. Figure 18b). Although elementwise addition is highly efficient in MKL, the transformation overhead cannot be amortized.

8.7 Discussion
The key learnings from our empirical evaluation are the following: (1) RMA+ excels for mixed workloads that include both standard relational and matrix operations. (2) Only RMA+ can avoid data transformations in mixed workloads; data transformations may be costly and consume more than 90% of the overall runtime. (3) For complex matrix operations, however, transforming the data to a suitable format may pay off: in our approach, we are free to transform the data whenever beneficial. (4) In terms of scalability to large relations/matrices, our solution outperforms all competitors since it relies on the memory management of the database system for both the standard relational and the matrix operations. (5) Finally, the handling of contextual information, a feature of RMA, is efficient and can leverage optimizations that avoid expensive sortings.

9 CONCLUSION
In this paper, we targeted applications that store data in relations and must deal with queries that mix relational and linear algebra operations. We proposed the relational matrix algebra (RMA), an extension of the relational model with matrix operations that maintain important contextual information. RMA operations are defined over relations and can be nested. We implemented RMA over the internal data structures of MonetDB and do not require changes in the query processing pipeline. Our integration is competitive with state-of-the-art approaches and excels for mixed workloads.

RMA opens new opportunities for cross-algebra optimizations that involve both relational and linear algebra operations. It is also interesting to investigate the handling of wide tables, e.g., by storing them as skinny tables that are accessed accordingly or by combining operations to avoid the generation of wide intermediate tables.
ACKNOWLEDGMENTS
We thank Joseph Vinish D'silva for providing the source code of AIDA and helping out with the system. We thank the MonetDB team for their support. We thank Roland Kwitt for the discussion about application scenarios. The project was partially supported by the Swiss National Science Foundation (SNSF) through project number 407550_167177.

REFERENCES
[1] cuBLAS – NVIDIA BLAS library for GPUs. https://developer.nvidia.com/cublas.
[2] MKL – Intel Math Kernel Library. https://software.intel.com/en-us/mkl.
[3] NumPy. http://www.numpy.org, 2018.
[4] C. R. Aberger, A. Lamb, K. Olukotun, and C. Re. LevelHeaded: A unified engine for business intelligence and linear algebra querying. In ICDE. IEEE, 2018.
[5] C. R. Aberger, S. Tu, K. Olukotun, and C. Re. EmptyHeaded: A relational engine for graph processing. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD '16, pages 431-446, New York, NY, USA, 2016. ACM.
[6] P. Baumann, A. Dehmel, P. Furtado, R. Ritsch, and N. Widmann. The Multidimensional Database System RasDaMan. SIGMOD Rec., 27(2), June 1998.
[7] M. Boehm, A. Kumar, J. Yang, and H. V. Jagadish. Data Management in Machine Learning Systems. 2019.
[8] Z. Cai, Z. Vagena, L. Perez, S. Arumugam, P. J. Haas, and C. Jermaine. Simulation of database-valued Markov chains using SimSQL. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 637-648, New York, NY, USA, 2013. ACM.
[9] O. Dolmatova, N. Augsten, and M. H. Boehlen. Preserving contextual information in relational matrix operations. In Proceedings of the 36th International Conference on Data Engineering, ICDE '20, 4 pages.
[10] J. V. D'silva, F. De Moor, and B. Kemme. AIDA: Abstraction for advanced in-database analytics. Proc. VLDB Endow., 11(11):1400-1413, July 2018.
[11] Joseph Vinish D'silva, Florestan De Moor, and Bettina Kemme. Keep your host language object and also query it: A case for SQL query support in RDBMS for host language objects. In Carlos Maltzahn and Tanu Malik, editors, Proceedings of the 31st International Conference on Scientific and Statistical Database Management, SSDBM 2019, Santa Cruz, CA, USA, July 23-25, 2019, pages 133-144. ACM, 2019.
[12] W. Gander. Algorithms for the QR-Decomposition. Technical report, ETH Zurich, April 1980.
[13] A. Ghoting, R. Krishnamurthy, E. Pednault, B. Reinwald, V. Sindhwani, S. Tatikonda, Y. Tian, and S. Vaithyanathan. SystemML: Declarative machine learning on MapReduce. In Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, ICDE '11, pages 231-242, Washington, DC, USA, 2011. IEEE Computer Society.
[14] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, 3rd edition, 1996.
[15] J. M. Hellerstein, C. Re, F. Schoppmann, D. Z. Wang, E. Fratkin, A. Gorajek, K. S. Ng, C. Welton, X. Feng, K. Li, and A. Kumar. The MADlib Analytics Library: Or MAD Skills, the SQL. Proc. VLDB Endow., 5(12):1700-1711, August 2012.
[16] D. Hutchison, B. Howe, and D. Suciu. LaraDB: A minimalist kernel for linear and relational algebra computation. CoRR, abs/1703.07342, 2017.
[17] Kaggle Inc. BIXI Montreal (public bicycle sharing system). https://www.kaggle.com/aubertsigouin/biximtl, 2019.
[18] R. A. Johnson and D. W. Wichern. Applied Multivariate Statistical Analysis. Pearson Prentice Hall, 2007.
[19] S. Luo, Z. J. Gao, M. Gubanov, L. L. Perez, and C. Jermaine. Scalable linear algebra on a relational database system. In 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pages 523-534, April 2017.
[20] W. McKinney. pandas: a foundational Python library for data analysis and statistics. 2011.
[21] D. Misev and P. Baumann. Homogenizing data and metadata retrieval in scientific applications. In Proceedings of the ACM Eighteenth International Workshop on Data Warehousing and OLAP, DOLAP '15, pages 25-34, New York, NY, USA, 2015. ACM.
[22] MonetDB. Online MonetDB reference. https://www.monetdb.org/Home, 2017.
[23] University of Trier. DBLP computer science bibliography. https://dblp.uni-trier.de, 2019.
[24] Oracle. Online Oracle UTL_NLA Package Reference. https://docs.oracle.com/database/121/ARPLS/u_nla.htm, 2016.
[25] C. Ordonez. Building statistical models and scoring with UDFs. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, SIGMOD '07, pages 1005-1016, New York, NY, USA, 2007. ACM.
[26] R project. The R Project for Statistical Computing. https://www.r-project.org/, 2018.
[27] Quick-R. R Matrix Algebra package overview. http://www.statmethods.net/advstats/matrix.html, 2017.
[28] C. Radhakrishna Rao and H. Toutenburg. Linear Models, Least Squares and Alternatives. Springer Series in Statistics. Springer-Verlag New York, 1995.
[29] SciDB, Inc. SciDB User's Guide Version 13.3.6203. In SciDB User's Guide, pages 1-258, 2013.
[30] M. Stonebraker, P. Brown, A. Poliakov, and S. Raman. The Architecture of SciDB. In Proceedings of the 23rd International Conference on Scientific and Statistical Database Management, SSDBM '11, pages 1-16. Springer-Verlag, 2011.
[31] Y. Zhang, H. Herodotou, and J. Yang. RIOT: I/O-efficient numerical computing without SQL. CoRR, abs/0909.1766, 2009.
[32] Y. Zhang, M. Kersten, M. Ivanova, and N. Nes. SciQL: Bridging the Gap Between Science and Relational DBMS. In Proceedings of the 15th Symposium on International Database Engineering & Applications, IDEAS '11. ACM, 2011.
[33] Y. Zhang, M. Kersten, and S. Manegold. SciQL: Array Data Processing Inside an RDBMS. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, SIGMOD '13, pages 1049-1052. ACM, 2013.