The Database Architectures Research Group at CWI

Martin Kersten, Stefan Manegold, Sjoerd Mullender
CWI Amsterdam, The Netherlands
fi[email protected]
http://www.cwi.nl/research-groups/Database-Architectures/

1. INTRODUCTION

The Database research group at CWI was established in 1985. It has steadily grown from two PhD students to a group of 17 people by the end of 2011. The group is supported by a scientific programmer and a system engineer to keep our machines running. In this short note, we look back at our past and highlight the multitude of topics being addressed.

2. THE MONETDB ANTHOLOGY

The workhorse and focal point for our research is MonetDB, an open-source columnar database system. Its development goes back as far as the early eighties, when our first relational kernel, called Troll, was shipped as an open-source product. It spread to ca. 1000 sites world-wide and became part of a software case-tool until the beginning of the nineties. None of the code of this system has survived, but the ideas and experiences on how to obtain a fast relational kernel through simplification and explicit materialization found their origin in this period.

The second half of the eighties was spent on building the first distributed main-memory database system in the context of the national Prisma project. A fully functional system of 100 processors and a, for that time, wealthy 1 GB of main memory showed the road to developing database technology from a different perspective: design from the processor to the slow disk, rather than the other way around.

Immediately after the Prisma project, a new kernel based on Binary Association Tables (BATs) was laid out; a toy illustration of this representation follows at the end of this section. This storage engine became accessible through MIL, a scripting language intended as a target for compiling SQL queries. The target application domain was to better support scientific applications with their (archaic) file structures. It quickly shifted to a more urgent and emerging area: several datamining projects called for better database support. This culminated in our first spin-off company, Data Distilleries, in 1995, which based their analytical customer relationship suite on the power provided by the early MonetDB implementations. In the years following, many technical innovations were paired with strong industrial maturing of the software base. Data Distilleries became a subsidiary of SPSS in 2003, which in turn was acquired by IBM in 2009.

Moving MonetDB Version 4 into the open-source domain required a large number of extensions to the code base. It became of the utmost importance to support a mature implementation of the SQL-03 standard, and the bulk of application programming interfaces (PHP, JDBC, Python, Perl, ODBC, RoR). The result of this activity was the first official open-source release in 2004. A very strong XQuery front-end was developed with partners and released in 2005 [1].

MonetDB remains a product well-supported by the group. All its members carry out part of the development and maintenance work, handle user inquiries, or act as guinea pigs for newly added features. A thorough daily regression testing infrastructure ensures that changes applied to the code base survive an attack of ca. 20 platform configurations, including several Linux flavors, Windows, FreeBSD, Solaris, and MacOS X. A monthly bugfix release and ca. 3 feature releases per year support our ever growing user community. The web portal (http://www.monetdb.org/) provides access to this treasure chest of modern database technology. It all helped us to create and maintain a stable platform for innovative research directions, as summarized below. The MonetDB spin-off company was set up to support its market take-up, and to provide a foundation for QA, support, and development activities that are hard to justify in a research institute on an ongoing basis.
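To make the BAT idea concrete, here is a toy sketch, in Python rather than MonetDB's C, of how a relational table decomposes into binary tables over which MIL-style operators can work one column at a time; all names are ours:

    # Minimal illustration (our toy example, not MonetDB source): a relational
    # table decomposed into Binary Association Tables (BATs), each pairing a
    # surrogate object identifier (head) with one attribute value (tail).
    people = [("alice", 23), ("bob", 31), ("carol", 27)]

    name_bat = [(oid, name) for oid, (name, _) in enumerate(people)]  # [oid | name]
    age_bat  = [(oid, age)  for oid, (_, age)  in enumerate(people)]  # [oid | age]

    # Operators touch one BAT at a time; matching oids glue columns back together.
    young       = {oid for oid, age in age_bat if age < 30}           # select on age
    young_names = [name for oid, name in name_bat if oid in young]    # reconstruction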

3. HARDWARE-CONSCIOUS DATABASE TECHNOLOGY

A key innovation in the MonetDB code base is its reliance on hardware-conscious algorithms. For decades, advances in the speed of commodity CPUs have far outpaced advances in RAM latency. Main-memory access has therefore become a performance bottleneck for many computer applications, including database management systems; a phenomenon widely known as the "memory wall." A revolutionary redesign of database architecture was called for in order to take advantage of modern hardware, and in particular to avoid hitting this memory wall. Our pioneering research on columnar and hardware-aware database technology, as materialized in MonetDB, is widely recognized, as indicated by the VLDB 2009 10-year Best Paper Award [19, 2] and two DaMoN best paper awards [22, 6]. Here, we briefly highlight important milestones.

Vertical Storage. Whereas traditionally, relational database systems store data in a row-wise fashion (which favors single-record lookups), MonetDB uses columnar storage, which favors analysis queries by making better use of CPU cache lines.

Bulk Query Algebra. Much like the CISC vs. RISC idea applied to CPU design, the MonetDB relational algebra is deliberately simplified compared to the traditional relational set algebra. Paired with an operator-at-a-time bulk execution model, rather than the traditional tuple-at-a-time pipelining model, this allows for much faster implementations on modern hardware, as the code requires far fewer function calls and conditional branches.

Cache-conscious Algorithms. The crucial aspect in overcoming the memory wall is good use of CPU caches, which requires careful tuning of memory access patterns. This led to a new breed of query processing algorithms. Their key requirement is to restrict any random data access pattern to data regions that fit into the CPU caches, avoiding cache misses and the resulting performance degradation. For instance, partitioned hash-join [2] first partitions both relations into H separate clusters that each fit into the CPU caches. The join is then performed per pair of matching clusters, building and probing the hash table on the inner relation entirely inside the CPU cache. With large relations and small CPU caches, efficiently creating a large number of clusters can become a problem in itself. If H exceeds the number of TLB entries or cache lines, each memory reference will trigger a TLB or cache miss, compromising performance significantly. With radix-cluster [17], we prevent that problem by performing the clustering in multiple passes, such that each pass creates at most as many new sub-clusters as there are TLB entries or cache lines. With radix-decluster [18], we complement partitioned hash-join with a projection (tuple reconstruction) algorithm with a cache-friendly data access pattern.
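The multi-pass clustering is easy to convey in a sketch. The following Python fragment is our illustration of the idea, not MonetDB's C implementation, and the parameter values are invented:

    def radix_cluster(tuples, key, total_bits=12, bits_per_pass=6):
        """Cluster tuples on the lowest total_bits of key(t) in several passes,
        so each pass fans out into at most 2**bits_per_pass sub-clusters
        (chosen to stay within, e.g., the number of TLB entries)."""
        clusters, shift = [list(tuples)], 0
        while shift < total_bits:
            bits = min(bits_per_pass, total_bits - shift)
            mask = (1 << bits) - 1
            refined = []
            for cluster in clusters:
                buckets = [[] for _ in range(1 << bits)]  # few enough to stay cache-resident
                for t in cluster:
                    buckets[(key(t) >> shift) & mask].append(t)
                refined.extend(b for b in buckets if b)
            clusters, shift = refined, shift + bits
        return clusters  # every cluster agrees on the low total_bits of its keys

    # Two relations clustered this way can be joined cluster-by-cluster in cache.
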
Memory Access Cost Modeling. For query optimization to work in a cache-conscious environment, and to enable automatic tuning of our cache-conscious algorithms on different types of hardware, we developed a methodology for creating cost models that take the cost of memory access into account [16]. The key idea is to abstract data structures as data regions and to model the complex data access patterns of database algorithms as compounds of a few basic access patterns. We developed cost functions to estimate the cache misses of each basic pattern, and rules to combine basic cost functions into the cost functions of arbitrarily complex patterns. The total cost is then the number of cache misses multiplied by their latency; the formula is sketched below. In order to work on diverse computer architectures, these models are parametrized at run-time using automatic calibration techniques.

Vectorized Execution. In the "X100" project, we explored a compromise between classical tuple-at-a-time pipelining and operator-at-a-time bulk processing [3]. The idea of vectorized execution is to operate on chunks (vectors) of data that are large enough to amortize function call overheads, but small enough to fit in CPU caches and thus avoid materialization into main memory; a toy pipeline follows after the formula below. Combined with just-in-time light-weight compression, it lowers the memory wall somewhat. The X100 project has been commercialized into the Actian/VectorWise company and product line (http://www.actian.com/products/vectorwise/).
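As a formula, our paraphrase of the cost model of [16] reads as follows; the symbols are ours, not the paper's notation:

    % Total memory-access cost: per cache level, the estimated miss count,
    % weighted by that level's miss latency. The M_i follow from composing the
    % basic access patterns of the algorithm under consideration.
    T_{\mathrm{mem}} \;=\; \sum_{i \,\in\, \{\mathrm{L1},\,\mathrm{L2},\,\mathrm{TLB}\}} M_i \cdot l_i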

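A toy pull-based pipeline conveys the vectorized execution model; the vector size and operators below are our own illustration, not X100 code:

    VECTOR_SIZE = 1024

    def scan(column):                         # produce cache-sized vectors
        for i in range(0, len(column), VECTOR_SIZE):
            yield column[i:i + VECTOR_SIZE]

    def select_lt(vectors, bound):            # one call per vector, not per tuple
        for vec in vectors:
            yield [v for v in vec if v < bound]

    def total(vectors):                       # aggregate consumes the pipeline
        return sum(sum(vec) for vec in vectors)

    result = total(select_lt(scan(list(range(10_000))), 100))   # -> 4950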
4. DISTRIBUTED PROCESSING

After more than a decade of rest at the frontier of distributed database processing, we embarked upon several innovative projects in this area again.

Armada. An adventurous project was Armada, in which we searched for technology to create a fully autonomous and self-regulating distributed database system [5]. The research hypothesis was to organize a large collection of database instances around a dynamically partitioned database. Each time an instance ran out of resources, it could solicit a spare machine and decide autonomously what portion to delegate to its peer. The decisions were reflected in the SQL catalog, which triggered continuous adaptive query modification to hunt after the portions in the loosely connected network of workers. It never matured as part of the MonetDB distribution, because at that time we did not have all the basic tools to let it fly.

Since then, the Merovingian toolkit has been developed, and it now provides the basis for massive distributed processing. It provides server administration, server discovery features, client proxying and funneling to accommodate large numbers of (web) clients, basic distributed (multiplex) query processing, and fail-over functionality for a large number of MonetDB servers in a network. It is the toolkit used by partner companies to build distributed datawarehouse solutions. With Merovingian we were able to open two new research tracks: DataCyclotron and Octopus. Our new machine cluster (http://www.scilens.org/platform/) provides a basis to explore both routes in depth.

DataCyclotron. The grand challenge of distributed query processing is to devise a self-organizing architecture which exploits all hardware resources optimally to manage the database hot-set, to minimize query response time, and to maximize throughput without single-point global coordination. The Data Cyclotron architecture [4] addresses this challenge using turbulent data movement through a storage ring built from distributed main memory, capitalizing on the functionality offered by modern remote-DMA network facilities. Queries assigned to individual nodes interact with the storage ring by picking up data fragments that are continuously flowing around, i.e., the hot-set. The storage ring is steered by the level of interest (LOI) attached to each data fragment. The LOI represents the cumulative query interest in a fragment as it passes around the ring multiple times. A fragment with an LOI below a given threshold, inversely proportional to the ring load, is pulled out to free up resources; a sketch of this rule follows below. This threshold is dynamically adjusted in a fully distributed manner, based on ring characteristics and locally observed query behavior. It optimizes resource utilization by keeping the average data access latency low. The approach is illustrated using an extensive and validated simulation study. The results underpin the robustness of the fragment hot-set management in turbulent workload scenarios.

A fully functional prototype of the proposed architecture has been implemented using modest extensions to MonetDB, and it runs within a multi-rack cluster equipped with Infiniband. Extensive experimentation using both micro benchmarks and high-volume workloads based on TPC-H demonstrates its feasibility. The Data Cyclotron architecture and experiments open a new vista for modern in-the-network distributed database architectures, with a plethora of research challenges.
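The LOI-based eviction rule can be sketched as follows; this is our simplification of the published architecture [4], with all names and constants invented:

    def ring_pass(fragments, interest, ring_capacity, k=1.0):
        """One ring revolution: accumulate query interest per fragment, then pull
        out fragments whose LOI fell below a threshold that is inversely
        proportional to the current ring load, as described in the text."""
        for frag in fragments:
            frag["loi"] += interest.get(frag["id"], 0.0)   # cumulative interest
        load = sum(frag["size"] for frag in fragments) / ring_capacity
        threshold = k / max(load, 1e-9)                    # threshold ~ 1 / ring load
        hot  = [f for f in fragments if f["loi"] >= threshold]
        cold = [f for f in fragments if f["loi"] < threshold]
        return hot, cold               # cold fragments free up ring memory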
Octopus. In the Octopus project, we deviate from the predominant approach in distributed database processing, where the data is spread across a number of machines one way or another before any query processing can take place. We start from a single master node in control of the database, with a variable number of worker nodes to be used for delegated query processing. Data is shipped just-in-time to the worker nodes using a need-to-know policy, and reused, if possible, in subsequent queries. A bidding mechanism among the workers yields the most efficient reuse of the parts of the original data available on the workers from previous queries; it is sketched at the end of this subsection.

The adaptive distributed architecture uses the master/workers paradigm: the master hosts the database and computes a query by generating distributed subqueries for as many workers as it currently has available. The workers recycle the data they have processed in the past as much as possible, to minimize data transfer costs. Due to the just-in-time replication, the system easily harvests non-dedicated computational resources, while supporting full SQL query expressiveness.

Our experiments show that the proposed adaptive distributed architecture is a viable and flexible approach for improving the query performance of a dedicated database server by using non-dedicated worker nodes, reaching benefits comparable to traditional distributed databases.
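A minimal rendition of that bidding step, with invented names and a deliberately simplistic score, might look like this:

    # Hedged sketch of the bidding idea (ours, not Octopus code): workers bid on
    # a subquery with the number of its input fragments they already hold from
    # earlier queries; the master delegates to the highest bidder, minimizing
    # the data that must be shipped just-in-time.
    def delegate(subquery_inputs, worker_caches):
        bids = {w: len(subquery_inputs & cached) for w, cached in worker_caches.items()}
        return max(bids, key=bids.get)

    # delegate({"lineitem.p1", "orders.p3"},
    #          {"w1": {"lineitem.p1"}, "w2": set()})   # -> "w1"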

5. ADAPTIVE INDEXING

Query performance strongly depends on finding an execution plan that touches as few superfluous tuples as possible. The access structures deployed for this purpose, however, are non-discriminative: they assume every subset of the indexed domain is equally important, and their structure causes a high maintenance overhead during updates. Moreover, while hard in general, the task of finding the optimal set of indices becomes virtually impossible in scenarios with unpredictable workloads.

With Database Cracking, we take a completely different approach. Database cracking combines features of automatic index selection and partial indexes. Instead of requiring a priori workload knowledge to build entire indices prior to query processing, it takes each query predicate as a hint on how to physically reorganize the data. Continuous physical data reorganization is performed on-the-fly during query processing, integrated in the query operators. When a column is queried by a predicate for the first time, a new cracker index is initialized. As the column is used in the predicates of further queries, the cracker index is refined by range partitioning, until sequentially searching a partition is faster than binary searching in the AVL tree that guides a search to the appropriate partition.

Keys in a cracker index are partitioned into disjoint key ranges, but left unsorted within each partition. Each range query analyzes the cracker index, scans the key ranges that fall entirely within the query range, and uses the two end points of the query range to further partition the appropriate two key ranges. Thus, in most cases, each partitioning step creates two new sub-partitions using logic similar to partitioning in quicksort. A range is partitioned into three sub-partitions if both end points fall into the same key range. This happens in the first partitioning step in a cracker index (because there is only one key range, encompassing all key values) but is unlikely thereafter [7].
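One such cracking step can be condensed into a few lines; this is our compact rendition of the idea in [7], whereas a real cracker column is reorganized in place and only the pieces overlapping the predicate are touched:

    def crack_three(column, lo, hi):
        """Partition column into [< lo | lo..hi | > hi]; return the cut positions."""
        smaller = [v for v in column if v < lo]
        inside  = [v for v in column if lo <= v <= hi]
        larger  = [v for v in column if v > hi]
        column[:] = smaller + inside + larger
        return len(smaller), len(smaller) + len(inside)

    col = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3]
    a, b = crack_three(col, 5, 10)   # first query: 5 <= v <= 10
    answer = col[a:b]                # the contiguous middle piece, here [9, 7]

Later queries remember the cut positions, so they scan and re-crack ever smaller pieces of the column.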
Updates and their efficient integration into the data structure are covered in [8]. Multi-column indexes to support selections, tuple reconstructions and general complex queries are covered in [9]. In addition, [9] supports partial materialization and adaptive space management via partial cracking.

While database cracking comes with very low overhead but slow convergence towards a fully optimized index, adaptive merging features faster convergence at the expense of a significantly higher overhead. Hybrid adaptive indexing aims at achieving fast convergence while keeping the overhead as low as with database cracking [10].

With stochastic cracking, we introduce a significantly more resilient approach to adaptive indexing. Stochastic cracking does use each query as advice on how to reorganize data, but not blindly so; it gains in resilience and avoids performance bottlenecks by allowing for lax and arbitrary choices in its decision-making. Thereby, we bring adaptive indexing forward to a mature formulation that confers the workload-robustness previous approaches lacked.

Ongoing work aims at combining adaptive indexing techniques with the ideas of physical design and auto-tuning tools. The goal is to exploit workload knowledge to steer adaptive indexing where possible, but to keep its flexibility and instant adaptation to changing workloads.

6. SCIENTIFIC DATABASES

After the first open-source release of MonetDB, we were keen to check its behavior on real-life examples beyond the classical benchmarks. The largest, well-documented and publicly available datawarehouse was the Sloan Digital Sky Survey (SDSS) / SkyServer. Embarking on its re-implementation was a challenge. None of the other DBMSs had accomplished a working implementation, either due to its complexity or for lack of resources (business drive).

Skyserver. We achieved a fully functional implementation of SkyServer (see http://www.scilens.org/). It proved that the column-store approach of MonetDB has great potential in the world of scientific databases. However, the application also challenged the functionality of our implementation and revealed that a fully operational SQL environment is needed, e.g., including persistent stored modules. Its initial performance was competitive with the reference platform, Microsoft SQL Server 2005, and the analysis of SDSS query traces hinted at several techniques to boost performance by utilizing repetitive behavior and zoom-in/zoom-out access patterns that were not captured by the system.

Recycler. An immediate follow-up project focused on developing a recycler component for MonetDB. It acts as an intelligent cache of all intermediate results, avoiding recomputation of any subquery as often as possible, within the confines of the storage set aside for the intermediates. The results were published in 2009 at SIGMOD and received the runner-up best paper award [11].

Recycling can be considered an adaptive materialized view scheme: any subquery can be re-used, and no a priori decision by a human DBA is needed. It is also more effective than recycling only final query result sets. Integration of the recycler with the SDSS application showed that a few materialized views had been forgotten in the original design, which would have improved throughput significantly. This was found without human intervention.
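In spirit, the recycler behaves like the following sketch; this is our own illustration, not the MonetDB component, and the benefit-based eviction policy is invented:

    class Recycler:
        """Cache intermediate results keyed by plan fragment; reuse on a match,
        evict the least valuable entries when the storage budget overflows."""
        def __init__(self, budget):
            self.budget = budget
            self.pool = {}                    # fragment -> [result, size, benefit]

        def run(self, fragment, compute, cost=1.0):
            if fragment in self.pool:         # subquery seen before: reuse it
                entry = self.pool[fragment]
                entry[2] += cost              # credit the saved work
                return entry[0]
            result = compute()
            self._admit(fragment, result, cost)
            return result

        def _admit(self, fragment, result, cost):
            size = len(result)
            if size > self.budget:
                return                        # too large to keep at all
            used = sum(e[1] for e in self.pool.values())
            while used + size > self.budget:  # evict lowest-benefit intermediates
                victim = min(self.pool, key=lambda f: self.pool[f][2])
                used -= self.pool[victim][1]
                del self.pool[victim]
            self.pool[fragment] = [result, size, cost]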

SciBORQ. Scientific discovery has shifted from being an exercise of theory and computation to becoming the exploration of an ocean of observational data. This transformation was identified by Jim Gray as the 4th paradigm of scientific discovery. State-of-the-art observatories, digital sensors, and modern scientific instruments produce Petabytes of information every day. This scientific data is stored in massive data centers for later analysis. But even from the data management viewpoint, the capture, curating, and analysis of data is no longer a computation-intensive process, but a data-intensive one. The explosion in the amount of scientific data presents a new "stress test" for database design. Meanwhile, the scientists are confronted with a new question: how can relevant and compact information be found in such a flood of data?

Data warehouses underlying Virtual Observatories stress the capabilities of database management systems in many ways. They are filled on a daily basis with gigabytes of factual information, derived from large data scrubbing and computation-intensive feature extraction pipelines. The predominant data processing techniques focus on massive parallel loads and map-reduce algorithms. Such a brute-force approach, albeit effective in many cases, is costly.

In the SciBORQ project, we explore a different route [21], one based on the knowledge that only a small fraction of the data is of real value for any specific task. This fraction becomes the focus of scientific reflection through an iterative process of ad-hoc query refinement. However, querying a multi-terabyte database requires a sizable computing cluster, while ideally the initial investigation should run on the scientist's laptop.

We work on strategies for making biased snapshots of a science warehouse such that data exploration can be instigated with precise control over all resources. These snapshots, constructed with novel sampling techniques, are called impressions. An impression is selected such that either the statistical error of a query answer remains low, or an answer can be produced within strict time bounds. Impressions differ from previous sampling approaches because of their bias towards the focal point of the scientist's data exploration.

7. STREAMING

DataCell. Streaming applications have been en vogue for over a decade now, and continuous query processing has emerged as a promising paradigm with numerous applications. A more recent development is the need to handle both streaming queries and typical one-time queries in the same application setting, e.g., complex event processing (CEP). For example, data warehousing can greatly benefit from the integration of stream semantics, i.e., online analysis of incoming data combined with existing data. This is especially useful to provide low latency in data-intensive analysis in big data warehouses that are augmented with new data on a daily basis.

However, state-of-the-art database technology cannot handle streams efficiently due to their "continuous" nature. At the same time, state-of-the-art stream technology is purely focused on stream applications. The research efforts are mostly geared towards the creation of specialized stream management systems built with a different philosophy than a DBMS. The drawback of this approach is the limited opportunity to exploit successful past data processing technology, e.g., query optimization techniques.

For this new problem we combine the best of both worlds. In the DataCell project [14], we take a different route by designing a stream engine on top of an existing relational database kernel [15]. This includes reuse of both its storage/execution engine and its optimizer infrastructure. The major challenge then becomes the efficient support for specialized stream features.

We focus on incremental window-based processing, arguably the most crucial stream-specific requirement. In order to maintain and reuse the generic storage and execution model of the DBMS, we elevate the problem to the query plan level. Proper optimizer rules, scheduling, and intermediate result caching and reuse allow us to modify the DBMS query plans for efficient incremental processing. In extensive experiments, DataCell demonstrates efficient performance even compared to specialized stream engines, especially when scalability becomes a crucial factor.
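The gain of incremental processing over per-window recomputation is visible even in a toy example; this is ours, not a DataCell query plan:

    # A sliding-window sum maintained by adding arriving values and subtracting
    # expiring ones, instead of recomputing every window from scratch.
    from collections import deque

    def sliding_sums(stream, window=4, slide=2):
        buf, running, results = deque(), 0, []
        for i, v in enumerate(stream, start=1):
            buf.append(v)
            running += v
            if len(buf) > window:
                running -= buf.popleft()      # expire the oldest tuple
            if i >= window and (i - window) % slide == 0:
                results.append(running)       # emit one result per slide
        return results

    print(sliding_sums([1, 2, 3, 4, 5, 6, 7, 8]))   # [10, 18, 26]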
8. GRAPH DATABASES

As database kernel hackers, we cannot escape the semantic web wave. The requirements of RDF and triple stores also challenge the MonetDB kernel. In a recent paper [20], we showed how existing database technology can provide a sound basis for these environments. The base performance of MonetDB as a graph database is superb, but perhaps we may find novel tricks when a complete SPARQL front-end emerges on top of it. Most likely, we can re-use many of the techniques developed in the context of MonetDB/XQuery, in particular run-time query optimization [12]. Nevertheless, we did not chicken out, and got ourselves lured into European development projects to promote Linked Open Data. A step towards this goal is to carve out a benchmark that would shed light on the requirements in this field.

9. FUTURE

Despite the broad portfolio of topics, there is a strong drive and interest in pushing the boundaries of our knowledge by seeking areas hitherto unexplored. The mission for the future is to seek solutions where the DBMS interprets queries by their intent, rather than as a contract carved in stone for complete and correct answers. The result set should aid the user in understanding the database's content and provide guidance to continue his data exploration journey. A scientist can stepwise explore deeper and deeper into the database, and stop when the result content and quality reach his satisfaction point. At the same time, response times should be close to instant, such that they allow a scientist to interact with the system and explore the data in a contextualized way.

In our recent VLDB 2011 Challenges & Visions paper [13], we charted a route for such groundbreaking database research along five dimensions:

- One-minute DBMS for real-time performance.
- Multi-scale query processing.
- Post-processing for conveying meaningful data.
- Query morphing to adjust for proximity results.
- Query alternatives for lack of provenance.

Each direction would serve several PhDs and produce a database system with little resemblance to what we have built over the last thirty years. We look forward to seeing members of the database research community join our mission and take up the challenges expressed.

10. ACKNOWLEDGMENTS

This research was made possible with the dedication of the 2011 research team: Niels Nes, Fabian Groffen, Ying Zhang, Stratos Idreos, Milena Ivanova, Romulo Goncalves, Erietta Liarou, Lefteris Sidirourgos, Arjen de Rijke, Holger Pirk, Phan Minh Duc, Bart Scheers, Eleni Petraki, Thibault Sellam, Yagiz Kargin.

11. REFERENCES

[1] P. A. Boncz, T. Grust, M. van Keulen, S. Manegold, J. Rittinger, and J. Teubner. MonetDB/XQuery: A Fast XQuery Processor Powered by a Relational Engine. In Proc. of the ACM Int'l Conf. on Management of Data (SIGMOD), pages 479–490, June 2006.
[2] P. A. Boncz, S. Manegold, and M. L. Kersten. Database Architecture Optimized for the New Bottleneck: Memory Access. In Proc. of the Int'l Conf. on Very Large Data Bases (VLDB), pages 54–65, Sept. 1999.
[3] P. A. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. In Proc. of the Int'l Conf. on Innovative Data Systems Research (CIDR), pages 225–237, Jan. 2005.
[4] R. Goncalves and M. L. Kersten. The Data Cyclotron Query Processing Scheme. In Proc. of the Int'l Conf. on Extending Database Technology (EDBT), pages 75–86, Mar. 2010.
[5] F. Groffen, M. L. Kersten, and S. Manegold. Armada: a Reference Model for an Evolving Database System. In Proc. of the GI-Fachtagung Datenbanksysteme in Business, Technologie und Web (BTW), pages 417–435, Mar. 2007.
[6] S. Héman, N. Nes, M. Zukowski, and P. A. Boncz. Vectorized Data Processing on the Cell Broadband Engine. In Proc. of the Int'l Workshop on Data Management on New Hardware (DaMoN), page 4, June 2007.
[7] S. Idreos, M. L. Kersten, and S. Manegold. Database Cracking. In Proc. of the Int'l Conf. on Innovative Data Systems Research (CIDR), pages 68–78, Jan. 2007.
[8] S. Idreos, M. L. Kersten, and S. Manegold. Updating a Cracked Database. In Proc. of the ACM Int'l Conf. on Management of Data (SIGMOD), pages 413–424, June 2007.
[9] S. Idreos, M. L. Kersten, and S. Manegold. Self-organizing Tuple Reconstruction in Column-stores. In Proc. of the ACM Int'l Conf. on Management of Data (SIGMOD), pages 297–308, June 2009.
[10] S. Idreos, S. Manegold, H. Kuno, and G. Graefe. Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores. Proceedings of the VLDB Endowment (PVLDB), 4(9):585–597, June 2011.
[11] M. Ivanova, M. L. Kersten, N. Nes, and R. Goncalves. An Architecture for Recycling Intermediates in a Column-store. In Proc. of the ACM Int'l Conf. on Management of Data (SIGMOD), pages 309–320, June 2009.
[12] R. A. Kader, P. A. Boncz, S. Manegold, and M. van Keulen. ROX: Run-time Optimization of XQueries. In Proc. of the ACM Int'l Conf. on Management of Data (SIGMOD), pages 615–626, June 2009.
[13] M. L. Kersten, S. Idreos, S. Manegold, and E. Liarou. The Researcher's Guide to the Data Deluge: Querying a Scientific Database in Just a Few Seconds. Proceedings of the VLDB Endowment (PVLDB), 4(12):1474–1477, Aug. 2011.
[14] M. L. Kersten, E. Liarou, and R. Goncalves. A Query Language for a Data Refinery Cell. In Proc. of the Int'l Workshop on Event-driven Architecture, Processing and Systems (EDA-PS), Sept. 2007.
[15] E. Liarou, R. Goncalves, and S. Idreos. Exploiting the Power of Relational Databases for Efficient Stream Processing. In Proc. of the Int'l Conf. on Extending Database Technology (EDBT), pages 323–334, Mar. 2009.
[16] S. Manegold, P. A. Boncz, and M. L. Kersten. Generic Database Cost Models for Hierarchical Memory Systems. In Proc. of the Int'l Conf. on Very Large Data Bases (VLDB), pages 191–202, Aug. 2002.
[17] S. Manegold, P. A. Boncz, and M. L. Kersten. Optimizing Main-Memory Join on Modern Hardware. IEEE Transactions on Knowledge and Data Engineering (TKDE), 14(4):709–730, July 2002.
[18] S. Manegold, P. A. Boncz, N. Nes, and M. L. Kersten. Cache-Conscious Radix-Decluster Projections. In Proc. of the Int'l Conf. on Very Large Data Bases (VLDB), pages 684–695, Aug. 2004.
[19] S. Manegold, M. L. Kersten, and P. A. Boncz. Database Architecture Evolution: Mammals Flourished Long Before Dinosaurs Became Extinct. Proceedings of the VLDB Endowment (PVLDB), 2(2):1648–1653, Aug. 2009.
[20] L. Sidirourgos, R. Goncalves, M. L. Kersten, N. Nes, and S. Manegold. Column-Store Support for RDF Data Management: Not All Swans Are White. In Proc. of the Int'l Conf. on Very Large Data Bases (VLDB), pages 1553–1563, Sept. 2008.
[21] L. Sidirourgos, M. L. Kersten, and P. A. Boncz. SciBORQ: Scientific Data Management with Bounds on Runtime and Quality. In Proc. of the Int'l Conf. on Innovative Data Systems Research (CIDR), pages 296–301, Jan. 2011.
[22] M. Zukowski, N. Nes, and P. A. Boncz. DSM vs. NSM: CPU Performance Tradeoffs in Block-oriented Query Processing. In Proc. of the Int'l Workshop on Data Management on New Hardware (DaMoN), pages 47–54, June 2008.
