
Self-organizing Tuple Reconstruction in Column-stores

Stratos Idreos, Martin L. Kersten, Stefan Manegold
CWI Amsterdam, The Netherlands
[email protected] [email protected] [email protected]

ABSTRACT

Column-stores gained popularity as a promising physical design alternative. Each attribute of a relation is physically stored as a separate column, allowing queries to load only the required attributes. The overhead incurred is on-the-fly tuple reconstruction for multi-attribute queries. Each tuple reconstruction is a join of two columns based on tuple IDs, making it a significant cost component. The ultimate physical design is to have multiple presorted copies of each base table such that tuples are already appropriately organized in multiple different orders across the various columns. This requires the ability to predict the workload, idle time to prepare, and infrequent updates.

In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple reconstruction cost in a self-organizing way. It achieves performance similar to using presorted data, but without requiring the heavy initial presorting step itself. Instead, it handles dynamic, unpredictable workloads with no idle time and frequent updates. Auxiliary dynamic data structures, called cracker maps, provide a direct mapping between pairs of attributes used together in queries for tuple reconstruction. A map is continuously physically reorganized as an integral part of query evaluation, providing faster and reduced data access for future queries. To enable flexible and self-organizing behavior in storage-limited environments, maps are materialized only partially as demanded by the workload. Each map is a collection of separate chunks that are individually reorganized, dropped or recreated as needed. We implemented partial sideways cracking in an open-source column-store. A detailed experimental analysis demonstrates that it brings significant performance benefits for multi-attribute queries.

Categories and Subject Descriptors: H.2 [DATABASE MANAGEMENT]: Physical Design - Systems
General Terms: Algorithms, Performance, Design
Keywords: Database Cracking, Self-organization

1. INTRODUCTION

A prime feature of column-stores is to provide improved performance over row-stores in the case that workloads require only a few attributes of wide tables at a time. Each relation R is physically stored as a set of columns; one column for each attribute of R. This way, a query needs to load only the required attributes from each relevant relation. This happens at the expense of requiring explicit (partial) tuple reconstruction in case multiple attributes are required. Each tuple reconstruction is a join between two columns based on tuple IDs/positions and becomes a significant cost component in column-stores, especially for multi-attribute queries [2, 6, 10]. Whenever possible, position-based join-matching and sequential data access are exploited. For each relation Ri in a query plan q, a column-store needs to perform at least Ni − 1 tuple reconstruction operations for Ri within q, given that Ni attributes of Ri participate in q.

Column-stores perform tuple reconstruction in two ways [2]. With early tuple reconstruction, the required attributes are glued together as early as possible, i.e., while the columns are loaded, leveraging N-ary processing to evaluate the query. On the other hand, late tuple reconstruction exploits the column-store architecture to its maximum. During query processing, "reconstruction" merely refers to getting the attribute values of qualifying tuples from their base columns as late as possible, i.e., only once an attribute is required in the query plan. This approach allows the query engine to exploit CPU- and cache-optimized vector-like operator implementations throughout the whole query evaluation. N-ary tuples are formed only once the final result is delivered.

Like most modern column-stores [12, 4, 15], we focus on late reconstruction. Comparing early and late reconstruction, the educative analysis in [2] observes that the latter incurs the overhead of reconstructing a column more than once, in case it occurs more than once in a query. Furthermore, exploiting sequential access patterns during reconstruction is not always possible, since many operators (joins, group by, order by, etc.) are not tuple order-preserving.

The ultimate access pattern is to have multiple copies for each relation R, such that each copy is presorted on another attribute in R. All tuple reconstructions of R attributes initiated by a restriction on an attribute A can be performed using the copy that is sorted on A. This way, the tuple reconstruction does not only exploit sequential access, but also benefits from focused access to only a small consecutive area in the base column (as defined by the restriction on A) rather than scattered access to the whole column. However, such a direction requires the ability to predict the workload and the luxury of idle time to prepare the physical design.

In addition, up to date there is no efficient way to maintain multiple sorted copies under updates in a column-store; thus it requires read-only or infrequently updated environments.

In this paper, we propose a self-organizing direction that achieves performance similar to using presorted data, but comes without the hefty initial price tag of presorting itself. Instead, it handles dynamic unpredictable workloads with frequent updates and with no need for idle time. Our approach exploits database cracking [7, 8, 9], which sets a promising direction towards continuous self-organization of data storage based on selections in incoming queries.

We introduce a novel design, partial sideways cracking, that provides a self-organizing behavior for both selections and tuple reconstructions. It gracefully handles any kind of complex multi-attribute query. It uses auxiliary self-organizing data structures to materialize mappings between pairs of attributes used together in queries for tuple reconstruction. Based on the workload, these cracker maps are continuously kept aligned by being physically reorganized, while processing queries, allowing the DBMS to handle tuple reconstruction using cache-friendly access patterns.

To enhance performance and adaptability, in particular in environments with storage restrictions, cracker maps are implemented as dynamic collections of separate chunks. This enables flexible storage management by adaptively maintaining only those chunks of a map that are required to process incoming queries. Chunks adapt individually to the query workload. Each chunk of a map is separately reorganized, dropped if extra storage space is needed, or recreated (entirely or in parts) if necessary.

We implemented partial sideways cracking on top of an open-source column-oriented DBMS, MonetDB [15]. (Partial sideways cracking is part of the latest release of MonetDB, available via http://monetdb.cwi.nl/.) The paper presents an extensive experimental analysis using both synthetic workloads and the TPC-H benchmark. It clearly shows that partial sideways cracking brings a self-organizing behavior and significant benefits even in the presence of random workloads, storage restrictions and updates.

The remainder of this paper is organized as follows. Section 2 provides the necessary background. Then, to enhance readability and fully cover the research space, we present partial sideways cracking in two steps. Focusing on the tuple reconstruction problem and neglecting storage restrictions at first, Section 3 introduces the basic sideways cracking technique using fully materialized maps, accompanied with an extensive experimental analysis. Then, Section 4 extends the basic approach with flexible storage management using partial maps. Detailed experiments demonstrate the significant benefits over the initial full materialization approach. Then, Section 5 presents the benefits of sideways cracking with the TPC-H benchmark. Related work is discussed in Section 6, and Section 7 concludes the paper.

2. BACKGROUND

This section briefly presents the experimentation platform, MonetDB (v 5.4), and the basics of database cracking.

2.1 A Column-oriented DBMS

MonetDB is a full-fledged column-store using late tuple reconstruction. Every relational table is represented as a collection of Binary Association Tables (BATs). Each BAT is a set of two columns. For a relation R of k attributes, there exist k BATs, each BAT storing the respective attribute as (key,attr) pairs. The system-generated key identifies the relational tuple that attribute value attr belongs to, i.e., all attribute values of a single tuple are assigned the same key. Key values form a dense ascending sequence representing the position of an attribute value in the column. Thus, for base BATs, the key column typically is a virtual non-materialized column. For each relational tuple t of R, all attributes of t are stored in the same position in their respective column representations. The position is determined by the insertion order of the tuples. This tuple-order alignment across all base columns allows the column-oriented system to perform tuple reconstructions efficiently in the presence of tuple order-preserving operators. Basically, the task boils down to a simple merge-like sequential scan over two columns, resulting in low data access costs through all levels of modern hierarchical memory systems. Let us go through some of the basic operators of MonetDB's two-column physical algebra.

Operator select(A,v1,v2) searches all (key,attr) pairs in base column A for attribute values between v1 and v2. For each qualifying attribute value, the key value (position) is included in the result. Since selections are mostly performed on base columns, the underlying implementation preserves the key-order also in the intermediate results.

Operator join(j1,j2) performs a join between attr1 of j1 and attr2 of j2. The result contains the qualifying (key1,key2) pairs. In general, this operator can maintain the tuple order only for the outer join input. Similarly, groupby and orderby operators cannot maintain tuple order for any of their inputs.

Operator reconstruct(A,r) returns all (key,attr) pairs of base column A at the positions specified by r. If r is the result of a tuple order-preserving operator, then, iterating over r, it uses cache-friendly in-order positional lookups into A. Otherwise, it requires expensive random access patterns.
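To make the positional nature of these operators concrete, the following Python sketch models the two-column algebra. It is illustrative only: the function names mirror the operators above, but the list-of-pairs representation with materialized keys is our own simplification, not MonetDB's actual implementation.

    # Toy model of the two-column algebra; columns are (key, attr) pairs
    # with dense keys, so reconstruction is a positional lookup.
    def select(col, v1, v2):
        """Keys (positions) of tuples whose attr value lies in [v1, v2]."""
        return [k for k, a in col if v1 <= a <= v2]

    def reconstruct(base, keys):
        """(key, attr) pairs of `base` at the given positions; sequential
        and cache-friendly if `keys` is sorted, random access otherwise."""
        return [(k, base[k][1]) for k in keys]

    A = [(0, 13), (1, 5), (2, 9), (3, 7)]
    B = [(0, 'x'), (1, 'y'), (2, 'z'), (3, 'w')]
    print(reconstruct(B, select(A, 6, 10)))   # -> [(2, 'z'), (3, 'w')]

Since select scans the base column in key order, its result is sorted and the subsequent reconstruct touches B strictly left to right; this is the sequential pattern the text above refers to.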
2.2 Selection-based Cracking

Let us now briefly recap selection cracking as introduced in [7]. The first time an attribute A is required by a query, a copy of column A is created, called the cracker column CA of A. Each selection operator on A triggers a range-based physical reorganization of CA based on the selection of the current query. Each cracker column has a cracker index (AVL-tree) to maintain partitioning information. Future queries benefit from the physically clustered data and do not need to access the whole column. Cracking continues with every query. Thus, the system continuously refines its "knowledge" about how values are spread in CA. Physical reorganization happens on CA while the original column is left as is, i.e., tuples are ordered according to their insertion sequence. This order is exploited for tuple reconstruction.

The operator crackers.select(A,v1,v2) replaces the original select operator. First, it creates CA if it does not exist. It searches the index of CA for the area where v1 and v2 fall. If the bounds do not exist, i.e., no query used them in the past, then CA is physically reorganized to cluster all qualifying tuples into a contiguous area.

The result is again a set of keys/positions. However, due to physical reorganization, cracker columns are no longer aligned with base columns and consequently the selection results are no longer ordered according to the tuple insertion sequence.

For queries with multiple selections, we need crackers.select only for the first selection, introducing the crackers.rel_select for all subsequent selections. It performs the tasks of select and reconstruct in one go.

The terminology "database cracking" reflects the fact that the database is conceptually cracked into pieces. It aims at unpredictable/dynamic environments. An extensive discussion on the pros and cons of cracking against traditional indices can be found in [7, 8]. In [8], cracking has been shown to also maintain its properties under high-volume updates.

Figure 1: A simple example
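The following Python sketch shows the core reorganization step behind crackers.select; it is a simplified stand-in for the two- and three-way split algorithms of [7]. A real cracker column consults its AVL-tree index to reorganize only the pieces that overlap the predicate instead of scanning the whole column as done here, and piece boundaries would be recorded in that index; all names are illustrative.

    # Crack a copy of a column, held as (key, value) pairs, on [v1, v2].
    def crack_range(pairs, v1, v2, val=lambda p: p[1]):
        """In-place partition into [ val<v1 | v1<=val<=v2 | val>v2 ].
        Returns (lo, hi); the qualifying middle piece is pairs[lo:hi]."""
        lo = 0
        for i in range(len(pairs)):            # pass 1: values below v1
            if val(pairs[i]) < v1:
                pairs[i], pairs[lo] = pairs[lo], pairs[i]
                lo += 1
        hi = lo
        for i in range(lo, len(pairs)):        # pass 2: values up to v2
            if val(pairs[i]) <= v2:
                pairs[i], pairs[hi] = pairs[hi], pairs[i]
                hi += 1
        return lo, hi

    CA = [(0, 12), (1, 3), (2, 8), (3, 19), (4, 6)]   # cracker column of A
    lo, hi = crack_range(CA, 5, 10)
    print([k for k, v in CA[lo:hi]])           # keys of tuples with 5<=A<=10

Note how the returned keys are no longer in insertion order: this is exactly the loss of base-column alignment discussed above.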

3. SIDEWAYS CRACKING

In this section, we introduce the basic sideways cracking technique using fully materialized maps. To motivate and illustrate the effect of our choices, we build up the architecture starting with simple examples before continuing with more flexible and complex ones. The section closes with an experimental analysis of sideways cracking with full maps against selection cracking and non-cracking approaches. The addition of adaptive storage management through partial sideways cracking is discussed and evaluated in Section 4. Section 5 shows the benefits on the TPC-H benchmark.

3.1 Basic Definitions

We define a cracker map MAB as a two-column table over two attributes A and B of a relation R. Values of A are stored in the left column, while values of B are stored in the right column, called head and tail, respectively. Values of A and B in the same position of MAB belong to the same relational tuple. All maps that have been created using A as head are collected in the map set SA of R.A. Maps are created on demand, only. For example, when a query q needs access to attribute B based on a restriction on attribute A and MAB does not exist, then q will create it by performing a scan over base columns A and B. For each cracker map MAB, there is a cracker index (AVL-tree) that maintains information about how A values are distributed over MAB.

Once a map MAB is created by a query q, it is used to evaluate q and it stays alive to speed up data access in future queries that need to access B based on A. Each such query triggers physical reorganization (cracking) of MAB based on the restriction applied to A. Reorganization happens in such a way that all tuples with values of A that qualify the restriction are in a contiguous area in MAB. We use the two algorithms of [7] to physically reorganize maps by splitting a piece of a map into two or three new pieces.

We introduce sideways.select(A,v1,v2,B) as a new selection operator that returns tuples of attribute B of relation R based on a predicate on attribute A of R as follows:

(1) If there is no cracker map MAB, then create one.
(2) Search the index of MAB to find the contiguous area w of the pieces related to the restriction σ on A.
(3) If σ does not match existing piece boundaries, physically reorganize w to move false hits out of the contiguous area of qualifying tuples.
(4) Update the cracker index of MAB accordingly.
(5) Return a non-materialized view of the tail of w.

Example. Assume a relation R(A, B) shown in Figure 1. The first query requests values of B where a restriction on A holds. The system creates map MAB and cracks it into three pieces based on the selection predicate. Via cracking, the qualifying B values are already clustered together, aligned with the qualifying A values. Thus, no explicit join-like operation is needed for tuple reconstruction; the tail column of the middle piece forms the query's result. Then, a similar second query arrives. From the index, we derive that (1) the entire middle piece belongs to the result, and hence, (2) only Pieces 1 and 3 must be analyzed and further cracked. As more queries are being processed, the system "learns", purely based on incoming queries, more about how data is clustered, and hence, can reduce data access.
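A minimal sketch of steps (1)-(5), reusing crack_range from the sketch in Section 2.2; a map is held as (head, tail) pairs, so the head value sits at index 0. The dict-based map registry and the inclusive range bounds are our own illustrative simplifications, not the actual implementation.

    maps = {}                                  # one cracker map per (A, B)

    def sideways_select(R, A, B, v1, v2):
        """Tail (B) values of tuples with v1 <= A <= v2, via map M_AB."""
        if (A, B) not in maps:                 # (1) create M_AB on demand
            maps[(A, B)] = list(zip(R[A], R[B]))
        m = maps[(A, B)]
        lo, hi = crack_range(m, v1, v2, val=lambda p: p[0])   # (2)-(4)
        return [b for _, b in m[lo:hi]]        # (5) tail of the middle piece

    R = {"A": [7, 4, 1, 3, 8, 2], "B": ["b1", "b2", "b3", "b4", "b5", "b6"]}
    print(sideways_select(R, "A", "B", 2, 4))  # -> ['b2', 'b4', 'b6']

The returned tail values come straight out of the contiguous middle piece, with no join-like matching against the base column.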
3.2 Multi-projection Queries

Let us now discuss queries with multiple tuple reconstruction operations. We start with queries over a single selection and multiple projections. Then, Section 3.3 discusses queries with multiple selections.

The Problem: Non-aligned Cracker Maps. A single-selection query q that projects n attributes requires n maps, one for each attribute to be projected. These maps MAx belong to the same map set SA. However, a naive use of the maps can lead to incorrect query results. Consider the example depicted in the upper part of Figure 2. The first query triggers the creation of MAB and physically reorganizes it based on A < 3. Similarly, the second query triggers the creation of MAC and cracks it according to A < 5. Then, the third query needs both MAB and MAC. Refining both maps according to A < 4 of the third query creates correct consecutive results for each map individually. However, since these maps had previously been cracked independently using different restrictions on A, the result tuples are not positionally aligned anymore, prohibiting efficient positional result tuple reconstruction. Maintaining the tuple identity explicitly by adding a key column to the maps is not an efficient solution, either. It increases the storage requirements and allows only expensive join-like tuple reconstruction requiring random access due to non-aligned maps.

The Solution: Adaptive Alignment. To overcome this problem, we extend the sideways.select operator with an alignment step that adaptively and on demand restores the alignment of all maps used in a query plan. The basic idea is to apply all physical reorganizations, due to selections on an attribute A, in the same order to all maps in SA. Due to the deterministic behavior of the cracking algorithms [7], this approach ensures alignment of the respective maps.

Obviously, in an unpredictable environment with no idle system time, we want to invest in this extra work only if it pays back, i.e., only once a map is required.

Figure 2: Multiple tuple reconstructions in multi-projection queries

In fact, performing alignment on-line is not an option. On-line alignment would mean that every time we crack a map, we also forward this cracking to the rest of the maps in its set. This is prohibitive for several reasons. First, in order to be able to align all maps in one go, we need to actually materialize and maintain all possible maps of a set, even the ones that the actual workload does not require. Most importantly, every query would have to touch all maps of a set, i.e., all attributes of the given relation. This immediately overshadows the benefit of using a column-store in touching only the relevant attributes every time. The overhead of having adaptive alignment is that each map MAx in a set SA needs to materialize the head attribute A so that MAx can be cracked independently. We will remove this restriction with partial sideways cracking in the next section.

To achieve adaptive alignment, we introduce a cracker tape TA for each set SA, which logs (in order of their occurrence) all selections on attribute A that trigger cracking of any map in SA. Each map MAx is equipped with a cursor pointing to the entry in TA that represents the last crack on MAx. Given a tape TA, a map MAx is aligned (synchronized) by successively forwarding its cursor towards the end of TA and incrementally cracking MAx according to all selections it passes on its way. All maps whose cursors point to the same position in TA are physically aligned.

To ensure that alignment is performed on demand only, we integrate it into query processing. When a query q needs a map M, then and only then, q aligns M. We further extend the sideways.select(A,v1,v2,B) operator with three new steps that maintain and use the cracker tapes as follows:

(1) If there is no TA, then create an empty one.
(2) If there is no cracker map MAB, then create one.
(3) Align MAB using TA.
(4) Search the index of MAB to find the contiguous area w of the pieces related to the restriction σ on A.
(5) If σ does not match existing piece boundaries, physically reorganize w to move false hits out of the contiguous area of qualifying tuples.
(6) Update the cracker index of MAB accordingly.
(7) Append predicate v1 < A < v2 to TA.
(8) Return a non-materialized view of the tail of w.

For a query with one selection and k projections, the query plan contains k sideways.select operators, one for each projection attribute. For example, assume a query that selects on A and projects B and C. Then, one sideways.select operator will operate over MAB and another over MAC. With the maps aligned and holding the projection attributes in the tails, the result is readily available. The bottom part of Figure 2 demonstrates how queries are evaluated using aligned maps, yielding the correctly aligned result.

Sideways cracking performs tuple reconstruction by efficiently maintaining aligned maps via cracking instead of using (random-access) position-based joins.

The alignment step follows the self-organizing nature of a cracking DBMS. Aligning a map M becomes less expensive the more queries use M, as incremental cracking successively reduces the size of pieces and hence the data that needs to be accessed. Moreover, the more frequently M is used, the fewer alignment steps are required per query to bring it up-to-date. Unused maps do not produce any processing costs.
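A hedged sketch of the tape mechanism, again reusing crack_range from Section 2.2: because the replay is deterministic, maps with identical head columns that have consumed the same tape prefix end up physically aligned. The tape and cursor containers are illustrative devices, not the actual MonetDB structures.

    tapes = {"A": []}    # cracker tape T_A: (v1, v2) predicates, in order
    cursors = {}         # map id -> tape position of its last applied crack

    def aligned_crack(m, map_id, v1, v2):
        head = lambda p: p[0]
        tape = tapes["A"]
        for p1, p2 in tape[cursors.get(map_id, 0):]:
            crack_range(m, p1, p2, val=head)        # (3) replay missed cracks
        lo, hi = crack_range(m, v1, v2, val=head)   # (4)-(6) current crack
        tape.append((v1, v2))                       # (7) log it for the set
        cursors[map_id] = len(tape)                 # cursor now at tape end
        return lo, hi

A map that is never used pays nothing: its cursor simply lags behind, and the backlog of cracks is replayed only when a query actually touches the map.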
3.3 Multi-selection Queries

The final step is to generalize sideways cracking for queries that select over multiple attributes. One approach is to create wider maps that include multiple attributes in different orderings. However, the many combinations, orderings and predicates in queries lead to huge storage and maintenance requirements. Furthermore, wider maps are not compatible with the cracker algorithms of [7] and the update techniques of [8]. Instead, we propose a solution that exploits aligned two-column maps in a way that enables efficient cache-friendly operations.

The Problem: Non-aligned Map Sets. Let us first consider conjunctive queries, e.g., the query of Figure 3. A query plan could use maps MAD, MBD and MCD. These maps belong to different sets SA, SB and SC, respectively. However, the alignment techniques presented before apply only to multiple maps within the same set. Keeping maps of different sets aligned is not possible at all, as each attribute requires/determines its own individual order for its maps. Thus, using the above map sets for the example query inherently yields non-aligned individual selection results, requiring expensive operations for subsequent tuple reconstructions.

The Solution: Use a Single Aligned Set. The challenge for multi-selections is to find a solution that uses maps of only one single set, and thus can exploit their alignment. We postpone the discussion about how to choose this one set till later in this section. To sketch our approach using the query of Figure 3 as example, we arbitrarily assume that set SA is chosen, i.e., we use maps MAB, MAC and MAD. Each map is first aligned to the most recent crack operation on A, and only then is it cracked given the current predicate on A.

Figure 3: Multiple tuple reconstructions in multi-selection queries

Given the conjunctive predicate, we know that we just created contiguous areas wB, wC and wD aligned across the involved maps that contain all result candidates. These areas are aligned since all maps were first aligned and then cracked based on the same predicate. Thus, all areas also have the same size k. To filter out the "false candidates" that fulfill the predicate on A, but not all other predicates, we use bit vector processing (X100 [4] and the study of [2] also exploit bit-vectors for filtering multiple predicates). Using a single bit vector of size k, if a tuple fulfills the predicate on B, the respective bit is set, otherwise cleared. Successively iterating over the aligned result areas in the remaining maps (wC in our example), the bits of tuples that do not fulfill the respective predicate are cleared. Finally, the bit vector indicates which wD tuples form the result. An example in Figure 3 illustrates the details using the following three new operators.

sideways.select_create_bv(A,v1,v2,B,v3,v4)
(1-7) Equal to sideways.select in Section 3.2.
(8) Create and return bit vector bv for w with v3 < B < v4.

sideways.select_refine_bv(A,v1,v2,B,v3,v4,bv)
(1-7) Equal to sideways.select in Section 3.2.
(8) Refine bit vector bv with v3 < B < v4 and return bv.

sideways.reconstruct(A,v1,v2,B,bv)
(1-7) Equal to sideways.select in Section 3.2.
(8) Create and return a result that contains the tail value of all tuples from w in MAB whose bit is set in bv.

Given the alignment of the maps and the bit vector, only positional lookups and sequential access patterns are involved. In addition, by clustering and aligning relevant data via cracking, the system needs to analyze only a small portion of the involved columns (equal to the size of the bit vector) for selections and tuple reconstructions.
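The following sketch renders this conjunctive plan for two maps of one set (select on A and B, project D), assuming both maps start aligned, i.e., have seen identical crack sequences; crack_range is reused from Section 2.2. A real bit vector would be a packed structure; a Python list of booleans stands in here.

    # Select on A and B, project D, using aligned maps M_AB and M_AD.
    def select_AB_project_D(m_ab, m_ad, a1, a2, b1, b2):
        head = lambda p: p[0]
        lo, hi = crack_range(m_ab, a1, a2, val=head)  # same bounds in both
        crack_range(m_ad, a1, a2, val=head)           # maps, being aligned
        bv = [b1 <= b <= b2 for _, b in m_ab[lo:hi]]  # create_bv over w
        return [d for bit, (_, d) in zip(bv, m_ad[lo:hi]) if bit]  # reconstruct

Both cracks apply the identical permutation because the head columns of aligned maps are identical, so the bit vector built over one area selects positionally into the other.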
Map Set Choice: Self-organizing Histograms. The remaining issue is to determine the appropriate map set. Our approach is based on the core of the "cracking philosophy", i.e., in an unpredictable environment with no idle system time, always perform the minimum investment. Do just enough operations to boost the current query. Do not invest in the future unless the benefit is known, or there is the luxury of time and resources to do so. In this way, for a query q, a set SA is chosen such that the restriction on A is the most selective in q, yielding a minimal bit vector size in order to load and analyze less data in this query plan.

The most selective restriction can be found using the cracker indices, for they maintain knowledge about how values are spread over a map. The size of the various pieces gives the exact number of tuples in a given value range. Effectively, we can view a cracker index as a self-organizing histogram. In order to estimate the result size of a selection over an attribute A, any available map in SA can be used. In case of alternatives, the most aligned map is chosen by looking at the distance of its cursor to the last position of TA. The bigger this distance, the less aligned a map is. A more aligned/cracked map can lead to a more accurate estimation. Using the cracker index of the chosen map MAx, we locate the contiguous area w that contains the result tuples. In case the predicate on A matches the boundaries of existing pieces in MAx, the result size is equal to the size |w| of w. Otherwise, we assume that w consists of n pieces W1,...,Wn, and derive |w| = |W1| + |W2| + ... + |Wn| and |w'| = |W2| + ... + |Wn-1| as upper and lower bounds, respectively. We can further tighten these bounds by estimating the qualifying tuples in W1 and Wn, e.g., using interpolation.

Disjunctive Queries. Disjunctive queries are handled in a symmetrical way. This time the first selection creates a bit vector with size equal to the size of the map and not to the size of the cracked area w (as with conjunctions). The rest of the selections need to analyze the areas outside w for any unmarked tuples that might qualify and refine the bit vector accordingly. The choice of the map set is again symmetric; we choose a set based on the least selective attribute. In this way, the areas that need to be analyzed outside the cracked area are as small as possible.

3.4 Complex Queries

Until now we studied multi-selection/projection queries. The rest of the operators are not affected by the physical reorganization step of cracking, as no operator other than tuple reconstruction depends on tuple insertion order. Thus, joins, aggregations, groupings etc. are all performed efficiently using the original column-store operators (e.g., see our experimental analysis). Potentially, many operators can exploit the clustering information in the maps, e.g., a max can consider only the last piece of a map, or a join can be performed in a partition-wise way exploiting disjoint ranges in the input maps. We leave such directions for future work as they go beyond the scope of this paper.

3.5 Updates

Update algorithms for a cracking DBMS have been proposed and analyzed in detail in [8]. An update is not applied immediately. Instead, it remains as a pending update and it is applied only when a query needs the relevant data, assisting the self-organizing behavior. This way, updates are applied while processing queries and affect only those tuples relevant to the query at hand.

For each cracker column, there exist a pending insertions column and a pending deletions column. An update is merely translated into a deletion and an insertion. Updates are applied/merged in a cracker column without destroying the knowledge of its cracker index, which offers continual reduced data access after an update.

Sideways cracking is compatible with [8] as follows. Each map MAB has a pending insertions table holding (A,B) pairs. Insertions are handled independently and on demand for each map using the Ripple algorithm [8]. The extension is that the first time an insertion is applied on a map of set SA, it is also logged in tape TA so that the rest of the SA maps can apply the insertions in the correct order during alignment. For deletions, we only need one pending deletions column for each set SA, as we only need (A,key) pairs to identify a deletion. Since maps do not contain the tuple keys, as cracker columns do, we maintain a map MAkey for each set SA. This map, when aligned and combined with the pending deletions column, gives the positions of the relevant deletes for the current query in the currently aligned maps. The Ripple algorithm [8] is used to move deletes out of the result area of the maps used in a plan.

3.6 Experimental Analysis

In this section, we present a detailed experimental analysis. We compare our implementation of selection and sideways cracking on top of MonetDB against the latest non-cracking version of MonetDB and against MonetDB on presorted data. We use a 2.4 GHz AMD Athlon 64 processor equipped with 2 GB RAM. The operating system is Fedora Core 8 (Linux 2.6.23). Unless mentioned otherwise, all experiments use a relational table of 9 attributes (A1 to A9), each containing 10^7 values randomly distributed in [1, 10^7].

Figure 4: Improving tuple reconstruction. Panels: (a) Exp1, multiple tuple reconstructions; (b) Exp2, varying selectivity

Exp1: Varying Tuple Reconstructions. The first experiment demonstrates the behavior in query plans with one selection, but with multiple tuple reconstructions:

(q1) select max(A2), max(A3), ... from R where v1 < A1 < v2

[...] With MonetDB, the select operator is order-preserving; hence, tuple reconstruction is performed using in-order positional key-lookups into the projection attribute's base column. The resulting sequential access pattern is very cache-friendly, ensuring that each page or cache-line is loaded at most once. On the contrary, with selection cracking, the result of crackers.select is no longer aligned with the base columns due to physical reorganization. Consequently, the tuple reconstruction is performed using randomly ordered positional key-lookups into the base column. Lacking both spatial and temporal locality, the random access pattern causes significantly more cache-/page-misses, making tuple reconstruction more expensive.

Exp2: Varying Selectivity. We repeat the previous experiment for 2 tuple reconstructions, but this time we vary selectivity factors from point queries up to 90% selectivity. We run 10^3 queries selecting randomly located ranges/points. Figure 4(b) shows the response time relative [...]


Figure 5: Join queries with multiple selections and tuple reconstructions (TR). Panels: (a) total cost; (b) select and TR cost before join; (c) TR cost after join (presorting cost: 12 secs)

Figure 6: Skewed workload (presorting cost: 3.5 secs)

Figure 7: Effect of updates. Panels: (a) LFHV scenario; (b) HFLV scenario

[...] reconstruction performance as purely sequential access at a lower investment than sorting. We see that the investment in clustering (sorting) pays off with 4 (8) or more projections. In this way, reordering intermediate results pays off when multiple projections share a single intermediate result. However, it is not beneficial with only a few projections or with multiple selections, where individual intermediate results prohibit the sharing of reordering investments. Also, as seen in Figure 4, presorted MonetDB and sideways cracking significantly outperform plain MonetDB even with just a few tuple reconstructions by having columns already aligned.

Exp4: Join Queries. We proceed with join queries that both select and project over multiple attributes. Two tables of 7 attributes and the following query are used.

select max(R1), max(R2), max(S1), max(S2) from R,S where v1 < [...]


Figure 8: Using partial maps (U=Unfetched, F=Fetched, E=Empty, M=Materialized, C=ChunkID)

[...] touch the non-hot area; in a self-organizing way, sideways cracking improves performance also for the non-hot set.

Exp6: Updates. Two scenarios are considered for updates: (a) the high frequency low volume scenario (HFLV), where every 10 queries we get 10 random updates, and (b) the low frequency high volume scenario (LFHV), where every 10^3 queries we get 10^3 random updates. Random q3 queries are used. Figure 7 shows that sideways cracking maintains high performance and a self-organizing behavior through the whole sequence of queries and updates, demonstrating similar performance as in [8]. We do not run on presorted data here, since to the best of our knowledge there is no efficient way to maintain multiple sorted copies under frequent updates in column-stores [6]. This is an open research problem. Obviously, resorting all copies with every update is prohibitive.

4. PARTIAL SIDEWAYS CRACKING

The previous section demonstrated that sideways cracking enables a column-store to efficiently handle multi-attribute queries. It achieves similar performance to presorted data, but without the heavy initial cost and the restrictions on updates and workload prediction. So far, we assumed that no storage restrictions apply. As any other indexing or caching mechanism, sideways cracking imposes a storage overhead. This section addresses this issue via partial sideways cracking. An extensive experimental analysis shows that it significantly improves performance under storage restrictions and enables efficient workload adaptation by partial alignment.

4.1 Partial Maps
The motivation for partial maps comes from a divide and conquer approach. The main concepts are the following. (1) Maps are only partially materialized, driven by the workload. (2) A map consists of several chunks. (3) Each chunk is a separate two-column table and (4) contains a given value range of the head attribute of this map. (5) Each chunk is treated independently, i.e., it is cracked separately and it has its own tape. Figure 8 illustrates a simplified example.

Basic Definitions. A map set SA of an attribute A consists of (a) a collection of partial maps and (b) a chunk map HA. HA contains A values along with the respective tuple key. Its role is to provide the partial maps of SA with any missing chunks when necessary. Each partial and chunk map has an AVL-tree based index to maintain partitioning information. Different maps in the same set do not necessarily hold chunks for the same value ranges. A partial map is created when a query needs it for the first time. The chunk map for a set S is created along with the creation of the first chunk of the first partial map in S.

An area w of a chunk map is defined as fetched if at least one partial map has fetched all tuples of w to create a new chunk. Otherwise, w is called unfetched. Similarly, an area c of a partial map is defined as materialized if this map has created a chunk for c. Otherwise, c is called empty. Figure 8 shows some simple examples.

For each fetched area w, the index of a chunk map maintains (i) a list of references to w, i.e., the IDs of the partial maps that currently hold a chunk created by fetching w, and (ii) a tape where all the cracks that happen on the chunks created by w are logged. If all these chunks are dropped (discussed below under "Storage Management"), then w is marked again as unfetched and its tape is removed.

Creating Chunks. New chunks for a map MAx are created on demand, i.e., each time a query q needs tuples from an empty area c of MAx. The area c corresponds to an area w of HA. We distinguish two cases, depending on whether w is fetched or not. Firstly, if w is unfetched, then currently no other map in SA holds any chunks created from w. In this case, depending on the value range that q requires, we either make a new chunk using all tuples of w or crack w in smaller areas to materialize only the relevant area (see examples in Figure 8). Secondly, in case w is already marked as fetched, it must not be cracked further, as this might lead to incorrect alignment as described in Section 3.2. For example, if multiple maps are used by a single query q that requires chunks created from an area w, then these chunks will not be aligned if created by differently cracked instances of w. Hence, a new chunk is created using all tuples in w. To actually create a new chunk for a map MAB, we use the keys stored in w to get the B values from B's base column.

Storage Management. A partial map is an auxiliary data structure, i.e., without loss of primary information, any chunk of any map can be dropped at any time if storage space is needed, e.g., for new chunks. In the current implementation, chunks are dropped based on how often queries access them. After a chunk is dropped, it can be recreated at any time, as a whole or only in parts, if the query workload requires it. This is a completely self-organizing behavior. Assuming there is no idle time in between, no available storage, and no way to predict the future workload, this approach assures that the maximum available storage space is exploited, and that the system always keeps the chunks that are really necessary for the workload hot-set.
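A hedged sketch of chunk materialization and dropping under these rules, with illustrative names: it omits the fetched/unfetched bookkeeping and per-area tapes described above, and assumes dense tuple keys indexing B's base column.

    # Partial map M_AB: chunks created on demand from the chunk map H_A,
    # which holds (A, key) pairs; B values are fetched via the keys.
    class PartialMap:
        def __init__(self):
            self.chunks = {}                 # (lo, hi) -> [(a, b), ...]

    def get_chunk(pmap, H_A, base_B, lo, hi):
        """Materialize the chunk covering [lo, hi] on first use."""
        if (lo, hi) not in pmap.chunks:      # empty area: fetch via H_A
            pmap.chunks[(lo, hi)] = [(a, base_B[k])
                                     for a, k in H_A if lo <= a <= hi]
        return pmap.chunks[(lo, hi)]

    def drop_chunk(pmap, lo, hi):
        """Free storage; the real index lazily marks nodes as deleted."""
        pmap.chunks.pop((lo, hi), None)

Because chunks carry no primary information, dropping and recreating them is always safe; only the invested cracking effort on the dropped chunk is lost.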

Before creating a new chunk, the system checks if there is sufficient storage available. If not, enough chunks are dropped to make room for the new one. Dropping a chunk c involves operations to update the corresponding cracker index I. To assist the learning behavior, lazy deletion is used, i.e., all nodes of I that refer to c are not removed but merely marked as deleted and hence can be reused when c (or parts of it) is recreated in the future.

Dropping the Head Column. The storage overhead is further reduced by dropping the head column of actively used chunks, at the expense of losing the ability to further crack these chunks. We consider two opportunities.

First, we drop the head column of chunks that have been cracked to an extent that each piece fits into the CPU cache. In case a future query requires further cracking of such pieces, it is cheap to sort a piece within the CPU cache. This action is then logged in the tape to ensure future alignment with the corresponding chunks of other maps.

Second, we drop the head column of chunks that have not been cracked recently, as queries use their pieces "as is". Once we need to crack such a chunk c in the future, we only [...]

[...] are updated only on demand. Thus, before making a new chunk from an unfetched area w, w is updated if necessary. Naturally, updates applied in a chunk map are also removed from the pending updates of all partial maps in this set.

4.2 Experimental Analysis

We proceed with a detailed assessment of our partial sideways cracking implementation on top of MonetDB. Using the same platform as in Section 3.6, we show that partial maps bring a significant improvement and a self-organizing behavior under storage restrictions and during alignment. For storage management with full maps, we use the same approach as for partial maps, i.e., existing maps are only dropped if there is not sufficient storage for newly requested maps. We always drop the least frequently accessed map(s).

For the experiments throughout this section, we use a relation with 11 attributes containing 10^6 tuples and 5 multi-attribute queries (Qi, i in {1,...,5}) of the following form:

(Qi) select Ci from R where v1 < [...]

Figure 9: Efficient handling of storage restrictions with partial maps (S=10K). Panels: (a) unlimited storage; (b) limited storage, T=6.5M; (c) limited storage, T=2M; (d) storage usage, F(ull) vs. P(artial) maps

Figure 10: Efficient adaptation to the workload with partial maps (T=6.5M). Panels: (a) random workload, result size S=1K tuples; (b) skewed workload, result size S=10K tuples; (c) storage usage, F(ull) vs. P(artial) maps

Figure 11: No overhead; total cumulative cost of 1000 queries for varying result size S

Figure 12: Total costs; varying workload change rate (S=10K, T=6M)

Figure 13: Improving alignment with partial maps (S=10K, T=unlimited). Panels: changing the workload every (a) 10, (b) 100 and (c) 200 queries

[...] workload. To simulate the skew, we force 9/10 queries to request random ranges from only 20% of the tuples, while the remaining queries request ranges from the rest of the domain. Both runs use a storage threshold of T = 6.5 * 10^6. Figures 10(a) and (b) depict the results. Compared to the previous experiment, the workload is now focused on specific data parts, either by more selective queries or by skew. In both cases, partial sideways cracking shows a self-organizing and normalized behavior without penalizing single queries as full maps do. Being restricted to handling complete maps (holding mostly unused data), full maps cannot take advantage of the workload characteristics and suffer from lack of storage. Figure 10(c) illustrates that full maps demand more storage and thus quickly hit the threshold. In contrast, partial maps exploit the available storage more efficiently and more effectively by materializing only the required chunks.

No Overhead in Query Sequence Cost. So far, we demonstrated that partial maps provide a more normalized per-query performance compared to full maps. In addition, Figure 11 shows that these benefits come for free. It depicts the total cost to process all queries in the basic experiment, varying both the selectivity and the storage threshold. With 30% selectivity (S = 3 * 10^5), both approaches have similar total cost, while with more selective queries partial maps significantly outperform full maps. This behavior, combined with the more normalized per-query performance, gives a strong advantage to partial maps. The next experiment demonstrates that the advantage of partial maps over full maps increases with more frequent workload changes.

Adapting to Frequently Changing Workloads. In all previous experiments we assume a fixed rate of changing workload, i.e., every 100 queries. Here we study the effect of varying this parameter.


Figure 14: TPC-H results ("presorted" times exclude presorting costs; Q4,8,10: 3 min.; Q1,6,7,12,14,15,19,20: 11 min.; Q3: 14 min.)

We run the basic experiment with fixed S = 10^4 and T = 6 * 10^6, but for various different rates of changing the workload. Figure 12 shows the total cost to process all queries for each case. The performance of full maps faces a significant degradation as the workload changes more often, causing maps to be dropped and recreated more frequently. In contrast, due to flexible and adaptive chunk management, partial maps offer a stable high performance that hardly decreases with more frequent workload changes.

Alignment Improvements. Let us now demonstrate the benefits of partial maps during alignment. We run the basic experiment for S = 10^4. To concentrate purely on the alignment cost, we use only two types of queries and assume no storage restrictions. Figure 13 shows results for changing the workload every 10, 100 or 200 queries. As we decrease the rate of changing workloads, the peaks for full maps become less frequent, but higher. These peaks represent the alignment cost. Each time the workload changes, the maps used by the new batch of queries have to be aligned with the cracks of the previous batch; the longer the batch, the more cracks, the higher the alignment costs. Partial maps do not suffer from the alignment cost. Being able to align chunks only partially, and only those required for the current query, partial maps avoid penalizing single queries, bringing a smoother behavior to the whole query sequence. Furthermore, notice that as more queries are processed, partial maps gain more information to continuously increase alignment performance, assisting the self-organizing behavior.

5. TPC-H EXPERIMENTS

In this section, we evaluate our implementation in real-life scenarios using the TPC-H benchmark [14] (scale factor 1) on the same platform as in the previous experiments. We use the TPC-H queries that have at least one selection on a non-string attribute, i.e., Queries 1, 3, 4, 6, 7, 8, 10, 12, 14, 15, 19, & 20 (cf. [14]). String cracking and base table joins exploiting the already partitioned cracker maps are expected to yield significant improvements also for the remaining queries, but these are directions of future work and complementary to this paper. For each query, we created a sequence of 30 parameter variations using the random query generator of the TPC-H release. For experiments on presorted data, we created copies of all relevant tables such that for each query there is a copy primarily sorted on its selection column and (where applicable) sub-sorted on its group-by and/or order-by column. We use MySQL to show the effects of using presorted data on a row-store.

Figure 14 shows the costs for each query sequence. Sideways cracking achieves similar performance to presorted MonetDB (ignoring the presorting cost). Depending on the query, the presorting cost is 3 to 14 minutes, while as seen in Figure 14, the first sideways cracking query (in each query sequence) takes between 0.75 and 3 seconds. In a self-organizing way, sideways cracking continuously improves performance without requiring the heavy initial step of presorting and workload knowledge. For most queries, it outperforms plain MonetDB as of the second run; for Queries 1 & 10, already the first run is faster.

The following table summarizes the benefits of sideways cracking (SiCr) and presorted MonetDB (PrMo) over plain MonetDB on the tested TPC-H queries (Q).

     Q   SiCr   PrMo
     1    64%    50%
     3    44%   -46%
     4     4%     6%
     6    80%    83%
     7    62%    28%
     8    20%   -36%
    10    12%     9%
    12    41%    42%
    14    19%    12%
    15    62%    60%
    19    61%    61%
    20    67%    65%

Having both efficient selections and tuple reconstructions, both sideways cracking and presorted MonetDB manage to significantly improve over plain MonetDB, especially for queries with multiple tuple reconstructions on large tables, e.g., Queries 1, 6, 7, 15, 19, 20. Queries with multiple non tuple-order-preserving operators (group by, order by, joins) and subsequent tuple reconstructions yield significant gains by restricting tuple reconstructions to small column areas, e.g., Queries 1, 3, 7. Query 19 is an example where a significant amount of tuple reconstructions is needed, as it contains a complex disjunctive where clause. The column-store has to reconstruct each attribute multiple times to apply the different predicates, whereas the row-store processes the tables tuple-by-tuple. Sideways cracking significantly reduces this overhead, providing performance comparable to the row-store.

In certain cases (e.g., Queries 3, 7, 8), cracking manages to even outperform presorted MonetDB. The TPC-H data comes already presorted on the keys of the Order table. Plain MonetDB (and MySQL) exploit the sorted keys, especially during joins (most queries join on Order keys). Fully sorting on the selection attribute completely destroys this order, making the presorted case even slower than the original one (both with MonetDB and MySQL). With sideways cracking, though, the initial order is only partially changed, providing more efficient access patterns during joins.
Our final experiment features a mixed workload. We run 5 sequential batches (B1..B5) of 12 different TPC-H queries with varying parameters, measuring the response time of sideways cracking relative to MonetDB. Already within the first batch (B1), sideways cracking outperforms MonetDB in many queries. This is because queries can reuse maps and partitioning information created by different queries over the same attributes. The high peak in the first batch comes from Query 12, which uses a map set not used by any other query. Naturally, after the first batch, sideways cracking improves even more.

6. RELATED WORK

Self-organization has become an active research area, e.g., [3, 5, 11, 13]. Pioneering literature mainly focuses on predicting the future workload as a basis for an appropriate physical design. This is mainly an off-line task that works well in stable environments or with a delay in dynamic ones. In contrast, cracking instantly reacts to every query, refining the physical data organization accordingly without the need for a workload predictor or lengthy reorganization delays.

The only other column-store that uses physical reorganization is C-Store [12, 1, 2]; each attribute sort order is propagated to the rest of the columns in a relation R, maintaining multiple projections over R. It targets read-only scenarios, using the luxury of time to pre-sort the database completely and exhaustively. Cracking targets exactly the opposite environments, with continuous and sudden workload shifts and updates [8]. Direct comparison with C-Store was not possible, as it does not provide a generic SQL interface. However, we believe that our experiments against MonetDB on presorted data give a fair comparison of sideways cracking against a presorted column-store.

A very interesting area is the opportunity to improve performance using compression. This naturally gives a boost in presorted data performance [1], while it is a promising research direction for database cracking too.

7. CONCLUSIONS

In this paper, we introduce partial sideways cracking, a key component in a self-organizing column-store based on physical reorganization in the critical path of query execution. It enables efficient processing of complex multi-attribute queries by minimizing the costs of late tuple reconstruction, achieving performance competitive with using presorted data, but requiring neither an expensive preparation step nor a priori workload knowledge. With its flexible and adaptive chunk-wise architecture, it yields significant gains and a clear self-organizing behavior even under random workloads, storage restrictions, and updates.

Database cracking has only scratched the surface of this promising direction for self-organizing DBMSs. The research agenda includes calls for innovations on cracker-joins, compression, aggregation, distribution and partitioning, as well as optimization strategies, e.g., cache-conscious chunk size enforcement in partial sideways cracking. Furthermore, row-store cracking is a fully unexplored and promising area.

8. REFERENCES

[1] D. Abadi et al. Integrating Compression and Execution in Column-Oriented Database Systems. SIGMOD 2006.
[2] D. Abadi et al. Materialization Strategies in a Column-Oriented DBMS. ICDE 2007.
[3] S. Agrawal et al. Database Tuning Advisor for Microsoft SQL Server. VLDB 2004.
[4] P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. CIDR 2005.
[5] N. Bruno and S. Chaudhuri. To Tune or not to Tune? A Lightweight Physical Design Alerter. VLDB 2006.
[6] S. Harizopoulos et al. Performance Tradeoffs in Read-Optimized Databases. VLDB 2006.
[7] S. Idreos, M. Kersten, and S. Manegold. Database Cracking. CIDR 2007.
[8] S. Idreos, M. Kersten, and S. Manegold. Updating a Cracked Database. SIGMOD 2007.
[9] M. Kersten and S. Manegold. Cracking the Database Store. CIDR 2005.
[10] S. Manegold et al. Cache-Conscious Radix-Decluster Projections. VLDB 2004.
[11] K. Schnaitter et al. COLT: Continuous On-Line Database Tuning. SIGMOD 2006.
[12] M. Stonebraker et al. C-Store: A Column-Oriented DBMS. VLDB 2005.
[13] D. C. Zilio et al. DB2 Design Advisor: Integrated Automatic Physical Database Design. VLDB 2004.
[14] TPC Benchmark H. http://www.tpc.org/tpch/.
[15] MonetDB. http://monetdb.cwi.nl/.
