Neo: A Learned Query Optimizer

Ryan Marcus[1], Parimarjan Negi[2], Hongzi Mao[2], Chi Zhang[1], Mohammad Alizadeh[2], Tim Kraska[2], Olga Papaemmanouil[1], Nesime Tatbul[2,3]
[1] Brandeis University  [2] MIT  [3] Intel Labs

arXiv:1904.03711v1 [cs.DB] 7 Apr 2019

ABSTRACT

Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query execution plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpass them.

1. INTRODUCTION

In the face of a never-ending deluge of machine learning success stories, every database researcher has likely wondered if it is possible to learn a query optimizer. Query optimizers are key to achieving good performance in database systems, and can speed up the execution time of queries by orders of magnitude. However, building a good optimizer today takes thousands of person-engineering-hours, and is an art only a few experts fully master. Even worse, query optimizers need to be tediously maintained, especially as the system's execution and storage engines evolve. As a result, none of the freely available open-source query optimizers come close to the performance of the commercial optimizers offered by IBM, Oracle, or Microsoft.

Due to the heuristic-based nature of query optimization, there have been many attempts to improve query optimizers through learning over the last several decades. For example, almost two decades ago, Leo, DB2's LEarning Optimizer, was proposed [54]. Leo learns from its mistakes by adjusting its cardinality estimations over time. However, Leo still requires a human-engineered cost model, a hand-picked search strategy, and a lot of developer-tuned heuristics, which take years to develop and perfect. Furthermore, Leo only learns better cardinality estimations, not new optimization strategies (e.g., how to account for uncertainty in cardinality estimates, operator selection, etc.).

More recently, the database community has started to explore how neural networks can be used to improve query optimizers [35, 60]. For example, DQ [22] and ReJOIN [34] use reinforcement learning combined with a human-engineered cost model to automatically learn search strategies to explore the space of possible join orderings. While these papers show that learned search strategies can outperform conventional heuristics on the provided cost model, they do not show a consistent or significant impact on actual query performance. Moreover, they still rely on the optimizer's heuristics for cardinality estimation, physical operator selection, and estimating the cost of candidate execution plans.

Other approaches demonstrate how machine learning can be used to achieve better cardinality estimates [20, 43, 44]. However, none demonstrate that their improved cardinality estimations actually lead to better query plans. It is relatively easy to improve the average error of a cardinality estimation, but much harder to improve estimations for the cases that actually improve query plans [26]. Furthermore, cardinality estimation is only one component of an optimizer. Unlike join order selection, identifying join operators (e.g., hash join, merge join) and selecting indexes cannot be (entirely) reduced to cardinality estimation. Finally, SkinnerDB showed that adaptive query processing strategies can benefit from reinforcement learning, but it requires a specialized query execution engine and cannot benefit from operator pipelining or other advanced parallelism models [56].

In other words, none of the recent machine-learning-based approaches come close to learning an entire optimizer, nor do they show how their techniques can achieve state-of-the-art performance (to the best of our knowledge, none of these approaches compare with a commercial optimizer). Showing that an entire optimizer can be learned remains an important milestone with far-reaching implications. Most importantly, if a learned query optimizer could achieve performance comparable to commercial systems after a short amount of training, the amount of human development time needed to create a new query optimizer would be significantly reduced. This, in turn, would make good optimizers available to a much broader range of systems, and could improve the performance of thousands of applications that use open-source databases today. Furthermore, it could change the way query optimizers are built, replacing an expensive stable of heuristics with a more holistic optimization problem. This should result in better maintainability, as well as lead to optimizers that will truly learn from their mistakes and adjust their entire strategy for a particular customer instance to achieve instance optimality [21].

In this work, we take a significant step towards the milestone of building an end-to-end learned optimizer with state-

of-the-art performance. To the best of our knowledge, this work is the first to show that an entire query optimizer can be learned. Our learned optimizer is able to achieve similar performance to state-of-the-art commercial optimizers, e.g., Oracle and Microsoft, and sometimes even surpass them. This required overcoming several key challenges, from capturing query semantics as vectors, processing tree-based query plan structures, designing a search strategy, incorporating physical operator and index selection, replacing human-engineered cost models with a neural network, adopting reinforcement learning techniques to continuously improve, and significantly shortening the training time for a given database. All these techniques were integrated into the first end-to-end learned query optimizer, called Neo (Neural Optimizer).

Figure 1: Neo system model

Neo learns to make decisions about join ordering, physical operator selection, and index selection. However, we have not yet reached the milestone of learning these tasks from scratch. First, Neo still requires a-priori knowledge about all possible query rewrite rules (this guarantees semantic correctness, and the number of rules is usually small). Second, we restrict Neo to project-select-equijoin-aggregate queries (though our framework is general and can easily be extended). Third, our optimizer does not yet generalize from one database to another, as our features are specific to a set of tables — however, Neo does generalize to unseen queries, which can contain any number of known tables. Fourth, Neo requires a traditional (weaker) query optimizer to bootstrap its learning process. As proposed in [35], we use the traditional query optimizer as a source of expert demonstrations, which Neo uses to bootstrap its initial policy. This technique, referred to as "learning from demonstration" [9, 16, 49, 50], significantly speeds up the learning process, reducing training time from days/weeks to just a few hours. The query optimizer used to bootstrap Neo can be much weaker in performance; after an initial training period, Neo surpasses its performance and the bootstrap optimizer is no longer needed. For this work, we use the PostgreSQL optimizer, but any traditional (open source) optimizer can be used.

Our results show that Neo outperforms commercial optimizers on their own query execution engines, even when it is bootstrapped using the PostgreSQL optimizer. Interestingly, Neo learns to automatically adjust to changes in the accuracy of cardinality predictions (e.g., it picks more robust plans if the cardinality estimates are less precise). Further, it can be tuned depending on customer preferences (e.g., prioritize worst-case performance vs. relative performance) — adjustments which are hard to achieve with traditional techniques.

We argue that Neo represents a step forward in building an entirely learned optimizer. Moreover, Neo can already be used, in its current form, to improve the performance of thousands of applications which rely on PostgreSQL and other open-source database systems (e.g., SQLite). We hope that Neo inspires many other database researchers to experiment with combining query optimizers and learned systems in new ways, similar to how AlexNet [23] changed the way image classifiers were built.

In summary, we make the following contributions:

• We propose Neo — an end-to-end learning approach to query optimization, including join ordering, index selection, and physical operator selection.
• We show that, after training on a dataset and a representative sample query workload, Neo is able to generalize even over queries it has not encountered before.
• We evaluate different feature engineering techniques and propose a new technique which implicitly represents correlations within the dataset.
• We show that, after a short amount of training, Neo is able to achieve performance comparable to Oracle's and Microsoft's query optimizers on their respective database systems.

The remainder of the paper is organized as follows: Section 2 provides an overview of Neo's learning framework and system model. Section 3 describes how queries and query plans are represented by Neo. Section 4 explains Neo's value network, the core learned component of our system. Section 5 describes vector embeddings, an optional learned representation of the underlying database that helps Neo understand correlations within the user's data. We present an experimental evaluation of Neo in Section 6, discuss related works in Section 7, and offer concluding remarks in Section 8.

2. LEARNING FRAMEWORK OVERVIEW

What makes Neo unique is that it is the very first end-to-end query optimizer. As shown in Figure 1, it replaces every component of a traditional Selinger-style [52] query optimizer with machine learning models: (i) the query representation is through features rather than an object-based query operator tree (e.g., a Volcano-style [15] iterator tree); (ii) the cost model is a deep neural network (DNN) model as opposed to hand-crafted equations; (iii) the search strategy is a DNN-guided learned best-first search strategy instead of plan space enumeration or dynamic programming; (iv) cardinality estimation is based on either histograms or a learned vector embedding scheme, combined with a learned model, instead of a hand-tuned histogram-based cardinality estimation model. Finally, (v) Neo uses reinforcement learning and learning from demonstration to integrate these components into an end-to-end query optimizer rather than relying on human engineering. While we describe the different components in the individual sections as outlined in Table 1, the following provides a general overview of how Neo learns, as depicted in Figure 1.

Expertise Collection The first phase, labeled Expertise, initially generates experience from a traditional query optimizer, as proposed in [35]. Neo assumes the existence of an

application-provided Sample Workload consisting of queries representative of the application's total workload. Additionally, we assume Neo has access to a simple, traditional rule- or cost-based Expert Optimizer (e.g., Selinger [52], PostgreSQL [1]). This simple optimizer is treated as a black box, and is only used to create query execution plans (QEPs) for each query in the sample workload. These QEPs, along with their latencies, are added to Neo's Experience (i.e., a set of plan/latency pairs), which will be used as a starting point in the next model training phase. Note that the Expert Optimizer can be unrelated to the underlying Database Execution Engine.

                         | Traditional Optimizer           | Neural Optimizer (Neo)
Creation                 | Human developers                | Demonstration, reinforcement learning (Section 2)
Query Representation     | Operator tree                   | Feature encoding (Section 3)
Cost Model               | Hand-crafted model              | Learned DNN model (Section 4)
Plan Space Enumeration   | Heuristics, dynamic programming | DNN-guided search strategy (Section 4.2)
Cardinality Estimation   | Histograms, hand-crafted models | Histograms, learned embeddings (Section 5)

Table 1: Traditional cost-based query optimizer vs. Neo

Model Building Given the collected experience, Neo builds an initial Value Model. The value model is a deep neural network with an architecture designed to predict the final execution time of a given partial or complete plan for a given query. We train the value network using the collected experience in a supervised fashion. This process involves transforming each user-submitted query into features (through the Featurizer module) useful for a machine learning model. These features contain both query-level information (e.g., the join graph, predicated attributes, etc.) and plan-level information (e.g., selected join order, access paths, etc.). Neo can work with a number of different featurization techniques, ranging from simple one-hot encodings (Section 3.2) to more complex embeddings (Section 5). Neo's value network uses tree convolution [40] to process the tree-structured QEPs (Section 4.1).

Plan Search Once query-level information has been encoded, Neo uses the value model to search over the space of QEPs (i.e., selection of join orderings, join operators, and indexes) and discover the plan with the minimum predicted execution time (i.e., value). Since the space of all execution plans for a particular query is far too large to exhaustively search, Neo performs a best-first search of the space, using the value model as a heuristic. A complete plan created by Neo, which includes a join ordering, join operators (e.g., hash, merge, loop), and access paths (e.g., index scan, table scan), is sent to the underlying execution engine, which is responsible for applying semantically-valid query rewrite rules (e.g., inserting necessary hash and sort operations) and executing the final plan. This ensures that every execution plan generated by Neo computes the correct result. The plan search is discussed in detail in Section 4.2.

Model Refinement As new queries are optimized through Neo, the model is iteratively improved and custom-tailored to the underlying database and execution engine. This is achieved by incorporating newly collected experience regarding each query's QEP and performance. Specifically, once a QEP is chosen for a particular query, it is sent to the underlying execution engine, which processes the query and returns the result to the user. Additionally, Neo records the final execution latency of the QEP, adding the plan/latency pair to its Experience. Then, Neo retrains the value model based on this experience, iteratively improving its estimates.

Discussion This process – featurizing, searching, and refining – is repeated for each query sent by the user. Neo's architecture is designed to create a corrective feedback loop: when Neo's learned cost model guides Neo to a query plan that Neo predicts will perform well, but then the resulting latency is high, Neo's cost model learns to predict a higher cost for the poorly-performing plan. Thus, Neo is less likely to choose plans with similar properties to the poorly-performing plan in the future. As a result, Neo's cost model becomes more accurate, effectively learning from its mistakes.

Neo's architecture, of using a learned cost model to guide a search through a large and complex space, is inspired by AlphaGo [53], a reinforcement learning system developed to play the game of Go. At a high level, for each move in a game of Go, AlphaGo uses a neural network to evaluate the desirability of each potential move, and uses a search routine to find a sequence of moves that is most likely to lead to a winning position. Similarly, Neo uses a neural network to evaluate the desirability of partial query plans, and uses a search function to find a complete query plan that is likely to lead to lower latency.

Both AlphaGo and Neo additionally bootstrap their cost models from experts. AlphaGo bootstraps from a dataset of Go games played by expert humans, and Neo bootstraps from a dataset of query execution plans built by a traditional query optimizer (which was designed by human experts). The reason for this bootstrapping is reinforcement learning's inherent sample inefficiency [16, 49]: without bootstrapping, reinforcement learning algorithms like Neo or AlphaGo may require millions of iterations [38] before even becoming competitive with human experts. Bootstrapping from an expert source (i.e., learning from demonstration) intuitively mirrors how young children acquire language and learn to walk by imitating nearby adults (experts), and has been shown to drastically reduce the number of iterations required to learn a good policy [16, 50]. Decreasing sample inefficiency is especially critical for database management systems: each iteration requires a query execution, and users are unlikely to be willing to execute millions of queries before achieving performance on-par with current query optimizers. Worse yet, executing a poor query execution plan takes longer than executing a good execution plan, so the initial iterations would take an infeasible amount of time to complete [35].

Thus, Neo can be viewed as a learning-from-demonstration reinforcement learning system similar to AlphaGo – there are, however, many differences between AlphaGo and Neo.

First, because of the grid-like nature of the Go board, AlphaGo can trivially represent the board as an image and use image convolution, possibly the most well-studied and highly-optimized neural network primitive [24, 28], in order to predict the desirability of a board state. On the other hand, query execution plans have a tree structure, and cannot be trivially represented as images, nor can image convolution be easily applied. Second, in Go, the board represents all the information relevant to a particular move, and can be represented using less than a kilobyte of storage. In query optimization, the data in the user's database is highly relevant to the performance of query execution plans, and is generally much larger than a kilobyte (it is not possible to simply feed a user's entire database into a neural network). Third, AlphaGo has a single, unambiguous goal: defeat its opponent and reach a winning game state. Neo, on the other hand, needs to take the user's preferences into account, e.g., should Neo optimize for average-case or worst-case latency?

The remainder of this paper describes our solutions to these problems in detail, starting with the notation and encoding of the query plans.

Figure 2: Partial query plan (for the example query SELECT * FROM A, B, C, D WHERE A.3 = C.3 AND A.4 = D.4 AND C.5 = B.5 AND A.2 < 5 AND B.1 = 'h';)

Figure 3: Query-level encoding (join graph adjacency matrix and column predicate vector for the example query)

3. QUERY FEATURIZATION

In this section, we describe how query plans are represented as vectors, starting with some necessary notation.

3.1 Notation

For a given query q, we define the set of base relations used in q as R(q). A partial execution plan P for a query q (denoted Q(P) = q) is a forest of trees representing an execution plan that is still being built. Each internal (non-leaf) tree node is a join operator ⋈_i ∈ J, where J is the set of possible join operators (e.g., hash join ⋈_H, merge join ⋈_M, loop join ⋈_L), and each leaf tree node is either a table scan, an index scan, or an unspecified scan over a relation r ∈ R(q), denoted T(r), I(r), and U(r) respectively.¹ An unspecified scan is a scan that has not yet been assigned as either a table or an index scan. For example, the partial query execution plan depicted in Figure 2 is denoted as:

[(T(D) ⋈_M T(A)) ⋈_L I(C)], [U(B)]

Here, the type of scan for B is still unspecified, as is the join that will eventually link B with the rest of the plan, but the plan specifies table scans of tables D and A, which feed into a merge join, whose result will then be joined using a loop join with C.

A complete execution plan is a plan with only a single tree and with no unspecified scans; all decisions on how the plan should be executed have been made. We say that one execution plan P_i is a subplan of another execution plan P_j, written P_i ⊂ P_j, if P_j could be constructed from P_i by (1) replacing unspecified scans with index or table scans, and (2) combining subtrees in P_i with a join operator.

¹ Neo can trivially handle additional scan types, e.g., bitmap scans.

3.2 Encodings

In order to train a neural network to predict the final latency of partial or complete query execution plans (QEPs), we require two encodings. First, a query encoding encodes information regarding the query that is independent of the query plan; for example, the involved tables and predicates fall into this category. Second, we require a plan encoding, which represents the partial execution plan.

Query Encoding The representation of query-dependent but plan-independent information in Neo is similar to the representations used in previous work [22, 34], and consists of two components. The first component encodes the joins performed by the query, which can be represented as an adjacency matrix of the join graph, e.g., in Figure 3, the 1 in the first row, third column corresponds to the join predicate connecting A and C. Both the row and the column corresponding to the table E are empty, because E is not involved in the example query. Here, we assume that at most one foreign key exists between each pair of relations. However, the representation can easily be extended to include more than one potential foreign key (e.g., by using the index of the relevant key instead of the constant value "1", or by adding additional columns for each foreign key). Furthermore, since this matrix is symmetric, we choose only to encode the upper triangular portion, colored in red in Figure 3. Note that the join graph does not specify the order of joins.

The second component of the query encoding is the column predicate vector. In Neo, we currently support three increasingly powerful variants, with varying levels of precomputation requirements:

1. 1-Hot (the existence of a predicate): a simple "one-hot" encoding of which attributes are involved in a query predicate. The length of the one-hot encoding vector is the number of attributes over all database tables. For example, Figure 3 shows the "one-hot" encoded vector with the positions for attributes A.2 and B.1 set to 1, since both attributes are used as part of a predicate. Note that join predicates are not considered here; thus, the learning agent only knows whether an attribute is present in a predicate or not. While naive, the 1-Hot representation can be built without any access to the underlying database.

2. Histogram (the selectivity of a predicate): a simple extension of the previous one-hot encoding which replaces the indication of a predicate's existence with the predicted selectivity of that predicate (e.g., the entry for A.2 could be 0.2, if we predict a selectivity of 20%). For predicting selectivity, we use an off-the-shelf histogram approach with uniformity assumptions, as used by PostgreSQL and other open-source systems.

3. R-Vector (the semantics of a predicate): the most advanced encoding scheme, where we use row vectors. We designed row vectors based on a natural language processing (NLP) model mirroring word vectors [36]. In this case, each entry in the column predicate vector contains semantically relevant information related to the predicate. This encoding requires building a model over the data in the database, and is the most expensive option. We discuss row vectors in Section 5.

The more powerful the encoding, the more degrees of freedom the model has to learn complex relationships. However, this does not necessarily mean that the model cannot learn more complex relationships with a simpler encoding. For example, even though Histogram does not encode anything about correlations between tables, the model might still learn about them and accordingly correct the cardinality estimations internally, e.g., from repeated observation of query latencies. However, with the R-Vector encoding, we make Neo's job easier by providing a semantically-enhanced representation of the query predicate.

Plan Encoding The second encoding we require represents the partial or complete query execution plan. While prior works [22, 34] have flattened the tree structure of each partial execution plan, our encoding preserves the inherent tree structure of execution plans. We transform each node of the partial execution plan into a vector, creating a tree of vectors, as shown in Figure 4. While the number of vectors (i.e., the number of tree nodes) can increase, and the structure of the tree itself may change (e.g., left-deep or bushy), every vector has the same number of columns.

Figure 4: Plan-level encoding

These vectors are created as follows: each node is transformed into a vector of size |J| + 2|R|, where |J| is the number of different join operators, and |R| is the number of relations. The first |J| entries of each vector encode the join type (e.g., in Figure 4, the root node uses a loop join), and the next 2|R| entries encode which relations are being used, and what type of scan (table, index, or unspecified) is being used. For leaf nodes, this subvector is a one-hot encoding, unless the leaf represents an unspecified scan, in which case it is treated as though it were both an index scan and a table scan (a 1 is placed in both the "table" and "index" columns). For any other node, these entries are the union of the corresponding children's entries. For example, the bottom-most loop join operator in Figure 4 has ones in the positions corresponding to table scans over D and A, and an index scan over C.

Note that this representation can contain two partial query plans (i.e., several roots) which have yet to be joined, e.g., to represent the following partial plan:

P = [(T(D) ⋈_M T(A)) ⋈_L I(C)], [U(B)]

When encoded, the U(B) root node would be encoded as:

[0 0 0 0 1 1 0 0 0 0]

Intuitively, partial execution plans are built "bottom up", and partial execution plans with multiple roots represent subplans that have yet to be joined together with a join operator. The purpose of these encodings is merely to provide a representation of execution plans to Neo's value network, described next.

4. VALUE NETWORK

In this section, we present the Neo value network, a deep neural network model which is trained to approximate the best-possible query latency that a partial execution plan P_i could produce (in other words, the best-possible query latency achievable by a complete execution plan P_f such that P_i is a subplan of P_f). Since knowing the best-possible complete execution plan for a query ahead of time is impossible (if it were possible, the need for a query optimizer would be moot), we approximate the best-possible query latency with the best query latency seen so far by the system.

Formally, let Neo's experience E be a set of complete query execution plans P_f ∈ E with known latency, denoted L(P_f). We train a neural network model M to approximate, for all P_i that are a subplan of any P_f ∈ E:

M(P_i) ≈ min{C(P_f) | P_i ⊂ P_f ∧ P_f ∈ E}

where C(P_f) is the cost of a complete plan. The user can change the cost function to alter the behavior of Neo. For example, if the user is concerned only with minimizing total query latency across the workload, the cost could be defined as the latency, i.e.,

C(P_f) = L(P_f).

However, if instead the user prefers to ensure that every query q in a workload performs better than a particular baseline, the cost function can be defined as

C(P_f) = L(P_f) / Base(P_f),

where Base(P_f) is the latency of plan P_f under that baseline. Regardless of how the cost function is defined, Neo will attempt to minimize it over time. We experimentally evaluate both of these cost functions in Section 6.4.4.

The model is trained by minimizing a loss function [51]. We use a simple L2 loss function:

(M(P_i) − min{C(P_f) | P_i ⊂ P_f ∧ P_f ∈ E})²

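The partial plans fed to the value network are the plan-level vector trees of Section 3.2. That encoding can be sketched as follows, using |J| = 2 join operators (merge, loop) and the relations A through D from the running example; function names are illustrative, not from the Neo codebase.

```python
# Sketch of the plan-level node encoding (Section 3.2): each node becomes a
# vector of size |J| + 2|R| (here 2 + 2*4 = 10): join-type one-hot, then a
# (table, index) slot pair per relation.

JOINS = ["merge", "loop"]     # |J| = 2
RELS = ["A", "B", "C", "D"]   # |R| = 4

def scan_vec(rel, kind):
    """Leaf vector. An unspecified scan sets both the table and the
    index slot of its relation."""
    v = [0] * (len(JOINS) + 2 * len(RELS))
    base = len(JOINS) + 2 * RELS.index(rel)
    if kind in ("table", "unspecified"):
        v[base] = 1
    if kind in ("index", "unspecified"):
        v[base + 1] = 1
    return v

def join_vec(op, left, right):
    """Internal node: one-hot join type plus the union of the children's
    scan entries."""
    v = [0] * (len(JOINS) + 2 * len(RELS))
    v[JOINS.index(op)] = 1
    for i in range(len(JOINS), len(v)):
        v[i] = max(left[i], right[i])
    return v

# Partial plan from Figure 2: [(T(D) join_M T(A)) join_L I(C)], [U(B)]
mj = join_vec("merge", scan_vec("D", "table"), scan_vec("A", "table"))
lj = join_vec("loop", mj, scan_vec("C", "index"))
ub = scan_vec("B", "unspecified")

print(lj)  # -> [0, 1, 1, 0, 0, 0, 0, 1, 1, 0]: table scans of D and A, index scan of C
print(ub)  # -> [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]: both slots of B set
```

The two printed vectors match the bottom-most loop join and the U(B) root described in Section 3.2.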
Figure 5: Value network architecture

model's architecture to create an inductive bias [33] suitable for query optimization: the structure of the neural network itself is designed to reflect an intuitive understanding of what causes query plans to be fast or slow. Intuitively, humans studying query plans learn to recognize suboptimal or good plans by pattern matching: a merge join on top of a hash join with a shared join key is likely inducing a redundant sort or hash step; a loop join on top of two hash joins is likely to be highly sensitive to cardinality estimation errors; a hash join using a fact table as the "build" relation is likely to incur spills; a series of merge joins that do not require re-sorting is likely to perform well; etc. Our insight is that all of these patterns can be recognized by analyzing subtrees of a query execution plan. Our model architecture is essentially a large bank of these patterns that are learned automatically, from the data itself, by taking advantage of a technique called tree convolution [40] (discussed in Section 4.1).

As shown in Figure 5, when a partial query plan is evaluated by the model, the query-level encoding is fed through a number of fully-connected layers, each decreasing in size. The vector outputted by the third fully-connected layer is concatenated onto the plan-level encoding of each tree node (the same vector is added to all tree nodes). This is a standard technique [53] known as "spatial replication" [63] for combining data that has a fixed size (the query-level encoding) and data that is dynamically sized (the plan-level encoding). Once each tree node vector has been augmented, the forest of trees is sent through several tree convolution layers [40], an operation that maps trees to trees. Afterwards, a dynamic pooling operation [40] is applied, flattening the tree structure into a single vector. Several additional fully-connected layers are used to map this vector into a single value, used as the model's cost prediction for the inputted execution plan. A formal description of the value network model is given in Appendix A.

4.1 Tree Convolution

Common neural network models, like fully-connected neural networks or convolutional neural networks, take as input tensors with a fixed structure, such as a vector or an image. In our problem, the features embedded in each execution plan are structured as nodes of a query plan tree (e.g., Figure 4). To process these features, we use tree convolution methods [40], an adaptation of traditional image convolution for tree-structured data.

Tree convolution is a natural fit for this problem. Similar to the convolution transformation for images, tree convolution slides a set of shared filters over each part of the query tree locally. Intuitively, these filters can capture a wide variety of local parent-children relations. For example, filters can look for hash joins on top of merge joins, or a join of two relations when a particular predicate is present. The output of these filters provides signals utilized by the final layers of the value network; filter outputs could signify relevant factors, such as when the children of a join operator are sorted on the key (in which case a merge join is likely a good choice), or a filter might estimate if the right-side relation of a join will have low cardinality (indicating that an index may be useful). We provide two concrete examples later in this section.

Operationally, since each node of the query tree has exactly two child nodes,3 each filter consists of three weight vectors, e_p, e_l, e_r. Each filter is applied to each local "triangle" formed by the vector x_p of a node and its left and right children, x_l and x_r, to produce a new tree node x'_p:

x'_p = σ(e_p ⊙ x_p + e_l ⊙ x_l + e_r ⊙ x_r).

Here, σ(·) is a non-linear transformation (e.g., ReLU [14]), ⊙ is a dot product, and x'_p is the output of the filter. Each filter thus combines information from the local neighborhood of a tree node (its children). The same filter is "slid" across each tree in an execution plan, allowing a filter to be applied to execution plans with arbitrarily sized trees. A set of filters can be applied to a tree in order to produce another tree with the same structure, but with a potentially different sized vector representing each node. In a large neural network, such as those in our experimental evaluation, typically hundreds of filters are applied.

Since the output of a tree convolution is also a tree with the same shape as the input (but with a different sized vector representing each node), multiple layers of tree convolution filters can be sequentially applied to an execution plan. The first layer of tree convolution filters will access the augmented execution plan tree, and each filter will "see" each parent/left child/right child triangle of the original tree. The amount of information seen by a particular filter is called the filter's receptive field [30]. The second layer of convolution filters will be applied to the output of the first, and thus each filter in this second layer will see information derived from a node n in the original augmented tree, n's children, and n's grandchildren. Thus, each tree convolution layer has a larger receptive field than the last. As a result, the first tree convolution layer will learn simple features (e.g., recognizing a merge join on top of a merge join), whereas the last tree convolution layer will learn complex features (e.g., recognizing a left-deep chain of merge joins).

We present two concrete examples in Figure 6 that show how the first layer of tree convolution can detect interesting patterns in query execution plans. In Example 1 of Figure 6a, we show two execution plans that differ only in the

3 We attach nodes with all zeros to each leaf node.

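To make the filter computation concrete, the following is a minimal pure-Python sketch of a single tree-convolution filter applied over a plan tree. It reproduces Example 1 of Figure 6; the tuple-based tree encoding and the scalar ReLU are our own illustrative choices, not Neo's actual implementation, which applies hundreds of filters as batched tensor operations.

```python
def relu(x):
    return x if x > 0 else 0.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tree_conv(node, e_p, e_l, e_r):
    """Apply one tree-convolution filter to every parent/left/right "triangle"
    of a binary plan tree, producing a tree of the same shape whose node values
    are x'_p = relu(e_p . x_p + e_l . x_l + e_r . x_r).  A node is a tuple
    (feature_vector, left_child, right_child); absent children are treated as
    all-zero vectors, mirroring the paper's zero-padding of leaf nodes."""
    if node is None:
        return None
    x_p, left, right = node
    zero = [0.0] * len(x_p)
    x_l = left[0] if left else zero
    x_r = right[0] if right else zero
    out = relu(dot(e_p, x_p) + dot(e_l, x_l) + dot(e_r, x_r))
    return (out, tree_conv(left, e_p, e_l, e_r), tree_conv(right, e_p, e_l, e_r))

# Example 1 from Figure 6: feature bits are [merge, hash, A, B, C].
leaf_a = ([0, 0, 1, 0, 0], None, None)
leaf_b = ([0, 0, 0, 1, 0], None, None)
leaf_c = ([0, 0, 0, 0, 1], None, None)
mj_ab = ([1, 0, 1, 1, 0], leaf_a, leaf_b)      # merge join of A and B
plan_mm = ([1, 0, 1, 1, 1], mj_ab, leaf_c)     # merge join on top of merge join
plan_hm = ([0, 1, 1, 1, 1], mj_ab, leaf_c)     # hash join on top of merge join
f = [1.0, -1.0, 0.0, 0.0, 0.0]                 # "two merge joins in a row" detector
print(tree_conv(plan_mm, f, f, f)[0], tree_conv(plan_hm, f, f, f)[0])  # prints: 2.0 0.0
```

The root outputs (2 for two merge joins in a row, 0 for a hash join on top of a merge join) match the detector behavior described for Figure 6d.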
[Figure 6 graphic: two example plan trees with per-node feature vectors, the tree-convolution filter weights e_p, e_l, e_r for each example, and the resulting per-node filter outputs.]
(a) Query trees (b) Features on each node (c) Tree conv filters (d) Output

Figure 6: Tree convolution examples

topmost join operator (a merge join and a hash join). As depicted in the top portion of Figure 6b, the join type (hash or merge) is encoded in the first two bits of the feature vector of each node. Now, if a tree convolution filter (Figure 6c, top) is comprised of three weight vectors with {1, −1} in the first two positions and zeros for the rest, it will serve as a "detector" for query plans with two merge joins in a row. This can be seen in Figure 6d (top): the root node of the plan with two merge joins in a row receives an output of 2 from this filter, whereas the root node of the plan with a hash join on top of a merge join receives an output of 0. Subsequent tree convolution layers can use this information to form more complex predicates, e.g. to detect three merge joins in a row (a pipelined query execution plan), or a mixture of merge joins and hash joins (which may require frequent re-hashing or re-sorting).

In Example 2 of Figure 6, suppose A and B are physically sorted on the same key, and are thus optimally joined together with a merge join operator, but that C is not physically sorted. The tree convolution filter shown in Figure 6(c, bottom) serves as a detector for query plans that join A and B with a merge join, behavior that is likely desirable. The top weights recognize the merge join ({1, -1} for the first two entries) and the right weights prefer table B over all other tables. The result of this convolution (Figure 6d, bottom) shows its highest output for the merge join of A and B (in the first plan), and a negative output for the merge join of A and C (in the second plan).

In practice, filter weights are learned over time in an end-to-end fashion. By using tree convolution layers in a neural network, performing gradient descent on the weights of each filter will cause filters that correlate with latency (e.g., helpful features) to be rewarded (remain stable), and filters with no clear relationship to latency to be penalized (pushed towards more useful values). This creates a corrective feedback loop, resulting in the development of filterbanks that generate useful features [28].

4.2 DNN-Guided Plan Search

The value network predicts the quality (cost) of an execution plan, but it does not directly give an execution plan. Following several recent works in reinforcement learning [2,53], we combine the value network with a search technique to generate query execution plans, resulting in a value iteration technique [5] (discussed at the end of the section). Given a trained value network and an incoming query q, Neo performs a search of the plan space for that query. Intuitively, this search mirrors the search process used by traditional database optimizers, with the trained value network taking on the role of the database cost model. Unlike these traditional systems, the value network does not predict the cost of a subplan, but rather the best possible latency achievable from an execution plan that includes a given subplan. This difference allows us to perform a best-first search [10] to find an execution plan with low expected cost. Essentially, this amounts to repeatedly exploring the candidate with the best predicted cost until a halting condition occurs.

The search process for query q starts by initializing an empty min heap to store partial execution plans, ordered by the value network's estimate of a partial plan's cost. Then, a partial execution plan with an unspecified scan for each relation in R(q) is added to the heap. For example, if R(q) = {A, B, C, D}, then the heap is initialized with P_0:

P_0 = [U(A)], [U(B)], [U(C)], [U(D)],

where U(r) is the unspecified scan for the relation r ∈ R(q). At each search iteration, the subplan P_i at the top of the min heap is removed. We enumerate all of P_i's children, Children(P_i), scoring them using the value network and adding them to the min heap. Intuitively, the children of P_i are all the plans creatable by specifying a scan in P_i or by joining two trees of P_i with a join operator. Formally, we define Children(P_i) as the empty set if P_i is a complete plan, and otherwise as all possible subplans P_j such that P_i ⊂ P_j and such that P_j and P_i differ by either (1) changing an unspecified scan to a table or index scan, or (2) merging two trees using a join operator. Once each child is scored and added to the min heap, another search iteration begins, resulting in the next most promising node being removed from the heap and explored.

While one could terminate this search process as soon as a leaf node (a complete execution plan) is found, the search procedure can easily be transformed into an anytime search algorithm [64], i.e. an algorithm that continues to find better and better results until a fixed time cutoff. In this variant, the search process continues exploring the most promising nodes from the heap until a time threshold is reached, at which point the most promising complete execution plan is returned. This gives the user control over the tradeoff between planning time and execution time. Users could even select a different time cutoff for different queries depending on their needs. In the event that the time threshold is reached before a complete execution plan is found, Neo's search procedure enters a "hurry up" mode [55], and greedily explores the most promising children of the last node explored until a leaf node is reached. The cutoff time should be tuned on a per-application basis, but we find that a value of 250ms is sufficient for a wide variety of workloads (see Section 6.5), a value that is acceptable for many applications.

From a reinforcement learning point of view, the combination of the value network with a search procedure is a value iteration technique [5]. Value iteration techniques cycle between two steps: estimating the value function, and then using that value function to improve the policy. We estimate the value function via supervised training of a neural network, and we use that value function to improve a policy via a search technique. Q-learning [61] and its deep neural network variants [38], which have been recently used for query optimization [22], are also value iteration methods: the value estimation step is similar, but they use that value function to select actions greedily (i.e., without a search). This approach is equivalent to Neo's "hurry up" mode. As our experiments show, combining value estimation with a search procedure leads to a system that is less sensitive to noise or inaccuracies in the value estimation model, resulting in significantly better query plans. This improvement has been observed in other fields as well [2,53].

5. ROW VECTOR EMBEDDINGS

Neo can represent query predicates in a number of ways, including a simple one-hot encoding (1-Hot) or a histogram-based representation (Histogram), as described in Section 3.2. Here, we motivate and describe row vectors, Neo's most advanced option for representing query predicates (R-Vector).

Cardinality estimation is one of the most important problems in query optimization today [25,29]. Estimating cardinalities is directly related to estimating selectivities of query predicates – whether these predicates involve a single table or joins across multiple tables. The more columns or joins are involved, the harder the problem becomes. Modern database systems make several simplifying assumptions about these correlations, such as uniformity, independence, and/or the principle of inclusion [26]. These assumptions often do not hold in real-world workloads, causing orders of magnitude increases in observed query latencies [25]. In Neo, we take a different approach: instead of making simplifying assumptions about data distributions and attempting to directly estimate predicate cardinality, we build a semantically-rich, vectorized representation of query predicates that can serve as an input to Neo's value model, enabling the network to learn generalizable data correlations.

While Neo supports several different encodings for query predicates, here we present row vectors, a new technique based on the popular word2vec [36] algorithm. Intuitively, we build a vectorized representation of each query predicate based on data in the database itself. These vectors are meaningless on their own, but the distances between these vectors have semantic meaning. Neo's value network can take these row vectors as inputs, and use them to identify correlations within the data and predicates with syntactically-distinct but semantically-similar values (e.g. Genre is "action" and Genre is "adventure").

5.1 R-Vector Featurization

The basic idea behind our approach is to capture contextual cues among values that appear in a database. To give a high-level example from the IMDB movie dataset [25], if a keyword "marvel-comics" shows up in a query predicate, then we wish to be able to predict what else in the database would be relevant for this query (e.g., other Marvel movies).

Word vectors To generate row vectors, we use word2vec — a natural language processing technique for embedding contextual information about collections of words [36]. In the word2vec model, each sentence in a large body of text is represented as a collection of words that share a context, where similar words often appear in similar contexts. These words are mapped to a vector space, where the angle and distance between vectors reflect the similarity between words. For example, the words "king" and "queen" will have similar vector representations, as they frequently appear in similar contexts (e.g. "Long live the..."), whereas words like "oligarch" and "headphones" will have dissimilar vector representations, as they are unlikely to appear in similar contexts.

We see a natural parallel between sentences in a document and rows in a database: values of correlated columns tend to appear together in a database row. In fact, word2vec-based embeddings have recently been applied to other database problems, such as semantic querying [7], entity matching [41], and data discovery [12]. In Neo, we use an off-the-shelf word2vec implementation [48] to build an embedding of each value in the database. We then utilize these embeddings to encode correlations across columns.4

We explored two variants of this featurization scheme. In the first approach, we treat each row in every table of the database as a training sentence. This approach captures correlations within a table. To better capture cross-table correlations, in our second approach, we augment the set of training sentences by partially denormalizing the database. Concretely, we join large fact tables with smaller tables which share a foreign key, and each resulting row becomes a training sentence. Denormalization enables learning correlations such as "Actors born in Paris are more likely to play in French movies". While a Paris-born actor's name and birthplace may be stored only in the names table, the cast info table captures information relating this actor to many French movies. By joining these two tables together, we can make these relationships — such as the actor's name, birthplace, and all the French movies they have appeared in — explicit to the word2vec training algorithm. Conceptually, denormalizing the entire database would provide the best embedding quality, but at the expense of significantly increasing the size and number of training examples (a completely denormalized database may have trillions of rows), and hence the word2vec training time. A fully denormalized database is often infeasible to materialize. In Section 6.3.2, we present the details of the row vector training performance in practice. Our word2vec training process is open source, and available on GitHub.5

Figure 7 presents a visual example to show how row vectors capture semantic correlations between database tables. We use t-SNE [58] to project embedded vectors of actor names from their original 100-dimensional space into two-dimensional space for plotting. The t-SNE algorithm finds low-dimensional embeddings of high-dimensional spaces that attempt to maintain the distance between points: points

4 Predicates with comparison operators, e.g. IN and LIKE, can lead to multiple matches. In this case, we take the mean of all the matched word vectors as the embedding input.
5 https://github.com/parimarjan/db-embedding-tools

SELECT count(*)
FROM title as t, movie_keyword as mk, keyword as k, info_type as it, movie_info as mi
WHERE it.id = 3
  AND it.id = mi.info_type_id
  AND mi.movie_id = t.id
  AND mk.keyword_id = k.id
  AND mk.movie_id = t.id
  AND k.keyword ILIKE '% love %'
  AND mi.info ILIKE '% romance %'

Figure 8: Example query with correlations

(a) Birthplace of each actor (b) Top actors in each genre

Figure 7: t-SNE projection of actor names embedded in the word2vec model. Column correlations across multiple IMDB tables show up as semantically meaningful clusters.

Keyword  Genre    Similarity  Cardinality
love     romance  0.24        11128
love     action   0.16        2157
love     horror   0.09        1542
fight    action   0.28        12177
fight    romance  0.21        3592
fight    horror   0.05        1104

Table 2: Similarity vs. Cardinality. In this case, correlated keywords and genres, as shown in the SQL query in Figure 8, also have higher similarity and higher cardinality.

that are close together in the two-dimensional space are close together in the high-dimensional space as well, and points that are far apart in the low-dimensional space are far apart in the high-dimensional space as well. As shown, various semantic groups (e.g., Chinese actors, sci-fi movie actors) are clustered together (and are thus also close together in the original high-dimensional space), even when these semantic relationships span multiple tables. Intuitively, this provides helpful signals to estimate query latency given similar predicates: as many of the clusters in Figure 7 are linearly separable, their boundaries can be learned by machine learning algorithms. In other words, since predicates with similar semantic values (e.g., two American actors) are likely to have similar correlations (e.g., be in American films), representing the semantic value of a query predicate allows the value network to recognize similar predicates as similar. In Section 6.4.1, we find that row vectors consistently improve Neo's performance compared to other featurizations.

Row vector construction In our implementation, the feature vectors for query predicates are constructed as follows. For every distinct value in the underlying database, we generate vectors which are a concatenation of the following:

1. One-hot encoding of the comparison operator (e.g. equal or not equal to)

2. Number of matched words

3. Column embedding generated by word2vec (100 values, in our experiments)

4. Number of times the given value is seen in the training data

The concatenated vectors replace the "1"s or "0"s in the column predicate vector of the 1-Hot representation of the query-level information (see Section 3). For columns without a predicate, zeros are added so that the vector remains the same size regardless of the number of predicates.

5.2 Analysis

Here, we analyze the embedding space learned by our row vector approach on the IMDB dataset. We use the SQL query in Figure 8 to illustrate how learned row vectors can be useful for tasks like cardinality estimation and query optimization. This query counts the number of movies with genre "romance" that contain the keyword "love". It spans five tables in the IMDB dataset. As input to word2vec training, we partially denormalized these tables by joining title, movie_keyword, keyword and title, movie_info, info_type. It is important to note that, after this denormalization, keywords and genres do not appear in the same row, but keywords-titles as well as titles-genres do appear in separate rows.

In Table 2, we compare the cosine similarity (a higher value indicating higher similarity) between the vectors for keywords and genres to their true cardinalities in the dataset. As shown, highly correlated keywords and genres (e.g., "love" and "romance") have higher cardinalities. As a result, this embedding provides a representative feature that can somewhat substitute for a precise cardinality estimation: a model built using these vectors as input can learn to understand the correlations within the underlying table.

PostgreSQL, with its uniformity and independence assumptions, always estimates the cardinalities for the final joined result to be close to 1, and therefore prefers to use nested loop joins for this query. In reality, the real cardinalities vary wildly, as shown in Table 2. For this query, Neo decided to use hash joins instead of nested loop joins, and as a result was able to execute this query 60% faster than PostgreSQL.

This simple example provides a clear indication that row vector embeddings can capture meaningful relationships in the data beyond histograms and one-hot encodings. This, in turn, provides Neo with useful information in the presence of highly correlated columns and values. An additional advantage of our row embedding approach is that, by learning semantic relationships in a given database, Neo can gain useful information even about column predicates that it has never seen before in its training set (e.g., infer similar cardinality using similar correlation between two attributes). While we can observe useful correlations in the row vectors built for the IMDB dataset, language models like word2vec are notoriously difficult to interpret [45]. To the best of

our knowledge, there are no formal methods to ensure that a word2vec model – on either natural language or database rows – will always produce helpful features. Developing such formal analysis is an active area of research in machine learning [31, 37]. Thus, while we have no evidence that our row vector embedding technique will work on every imaginable database, we argue that our analysis on the IMDB database (a database with significant correlations / violations of uniformity assumptions) provides early evidence that row vectors may also be useful in other similar applications with semantically rich datasets. We plan to pursue this as part of our future work.

[Figure 9 graphic: bar chart; series: JOB, Corporation, TPC-H; y-axis: relative performance (0 to 1.2); x-axis: PostgreSQL, SQLite, SQL Server, Oracle.]
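The sentence-construction step behind these embeddings (Section 5.1) can be sketched in a few lines. The miniature tables and values below are hypothetical stand-ins for the IMDB schema, and a real pipeline would feed the resulting sentences to an off-the-shelf word2vec implementation rather than use them directly:

```python
# Hypothetical miniature tables patterned after the IMDB schema.
title = {1: "movie_a", 2: "movie_b"}                          # id -> title value
movie_keyword = [(1, "love"), (1, "romance"), (2, "fight")]   # (movie_id, keyword)

def training_sentences(title, movie_keyword):
    """Partially denormalize: join each fact-table row with the dimension row
    it references via its foreign key, and emit the combined values as one
    word2vec "sentence" so that cross-table co-occurrences become visible."""
    return [[title[movie_id], keyword] for movie_id, keyword in movie_keyword]

sentences = training_sentences(title, movie_keyword)
print(sentences)  # [['movie_a', 'love'], ['movie_a', 'romance'], ['movie_b', 'fight']]
```

Because "movie_a" now co-occurs with both "love" and "romance" in the training sentences, a word2vec model trained on them can place those values near each other in the embedding space.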

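Similarly, the per-predicate feature vector of Section 5.1 (one-hot comparison operator, matched-word count, averaged word embedding, and training-data frequency) can be sketched as follows. The operator vocabulary, the 4-dimensional embeddings, and all values here are illustrative assumptions rather than Neo's exact encoding, which uses 100-dimensional embeddings:

```python
OPS = ["=", "!=", "ILIKE"]          # illustrative comparison-operator vocabulary

def predicate_features(op, value, embeddings, value_counts, dim=4):
    """Sketch of the R-Vector predicate encoding: a one-hot comparison
    operator, the number of matched words, the mean of the matched word
    embeddings (multi-word matches are averaged, per footnote 4), and the
    value's frequency in the training data."""
    op_onehot = [1.0 if op == o else 0.0 for o in OPS]
    vecs = [embeddings[w] for w in value.split() if w in embeddings]
    if vecs:
        mean_vec = [sum(col) / len(vecs) for col in zip(*vecs)]
    else:
        mean_vec = [0.0] * dim
    return op_onehot + [float(len(vecs))] + mean_vec + [float(value_counts.get(value, 0))]

emb = {"love": [1.0, 1.0, 1.0, 1.0], "romance": [0.5, 0.5, 0.5, 0.5]}
fv = predicate_features("ILIKE", "love romance", emb, {"love romance": 3})
print(fv)  # [0.0, 0.0, 1.0, 2.0, 0.75, 0.75, 0.75, 0.75, 3.0]
```

The resulting vector replaces the corresponding "1" in the 1-Hot column-predicate encoding; columns without a predicate are zero-padded so that every query produces a vector of the same size.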
Figure 9: Relative query performance to plans created by the native optimizer (lower is better) for different workloads

6. EXPERIMENTS

We evaluated Neo's performance using both synthetic and real-world datasets to answer the following questions: (1) how does the performance of Neo compare to commercial, high-quality optimizers, (2) how well does the optimizer generalize to new queries, (3) how long is the optimizer execution and training time, (4) how do the different encoding strategies impact the prediction quality, (5) how do other parameters (e.g., search time or loss function) impact the overall performance, and finally, (6) how robust is Neo to estimation errors.

6.1 Setup

We evaluate Neo across a number of different database systems, using three different benchmarks:

1. JOB: the join order benchmark [25], with a set of queries over the Internet Movie Data Base (IMDB) consisting of complex predicates, designed to test query optimizers.

2. TPC-H: the standard TPC-H benchmark [46], using a scale factor of 10.

3. Corp: a 2TB dataset together with 8,000 unique queries from an internal dashboard application, provided by a large corporation (on the condition of anonymity).

Unless otherwise stated, all experiments are conducted by randomly placing 80% of the available queries into a training set, and using the other 20% of the available queries as a testing set. In the case of TPC-H, we generated 80 training and 20 test queries based on the benchmark query templates, without reusing templates between training and test queries. We present results as the median performance from fifty randomly initialized neural networks. The Adam [19] optimizer is used for network training, as well as layer normalization [3] to stabilize neural network training. The "leaky" variant of rectified linear units [14] is used as the activation function. We use a search time cutoff of 250ms. The network architecture follows Figure 5, except that the size of the plan-level encoding depends on the featurization chosen (e.g. 1-Hot or Histogram).

6.2 Overall Performance

To evaluate Neo's overall performance, we compared the mean execution time of the query plans generated by Neo on two open-source (PostgreSQL 11, SQLite 3.27.1) and two commercial (Oracle 12c, Microsoft SQL Server 2017 for Linux) database systems with the execution time of the plans generated by each system's native optimizer, for each of our three workloads. Due to the license terms [47] of Microsoft SQL Server and Oracle, we can only show performance in relative terms.

For initial experience collection for Neo, we always used the PostgreSQL optimizer as the expert. That is, for every query in the training set, we used the PostgreSQL optimizer to generate an initial query plan. We then measured the execution time of this plan on the target execution engine (e.g., MS SQL Server) by forcing the target system, through query hints, to obey the proposed query plan. Next, we directly begin training: Neo encodes the execution plan for each query in the training set, these plans are executed on the native system, and the encoded vectors along with the resulting run times are added to Neo's experience.

Figure 9 shows the relative performance of Neo after 100 training iterations on each test workload, using the R-Vector encoding over the holdout dataset (lower is better). For example, with PostgreSQL and the JOB workload, Neo produces queries that take only 60% of the average execution time of the ones created by the original PostgreSQL optimizer. Since the PostgreSQL optimizer is used to gather initial expertise for Neo, this demonstrates Neo's ability to improve upon an existing open-source optimizer.

Moreover, for MS SQL Server and the JOB and Corp workloads, the query plans produced by Neo are also 10% faster than the plans created by the commercial optimizers on their native platforms. Importantly, both commercial optimizers, which include a multi-phase search procedure and a dynamically-tuned cost model with hundreds of inputs [13,42], are expected to be substantially more advanced than PostgreSQL's optimizer. Yet, by bootstrapping only with PostgreSQL's optimizer, Neo is able to eventually outperform or match the performance of these commercial optimizers on their own platforms. Note that the faster execution times are solely based on better query plans, without run-time modifications of the system. The only exception where Neo does not outperform the two commercial systems is the TPC-H workload. We suspect that both MS SQL Server and Oracle were overtuned towards TPC-H, as it is one of the most common benchmarks.

Overall, this experiment demonstrates that Neo is able to create plans that are as good as, and sometimes even better than, those of open-source optimizers and their significantly superior commercial counterparts. However, Figure 9 only compares the median performance of Neo after the 100th training episode. This naturally raises the

following questions: (1) how does the performance compare with a fewer number of training episodes, and how long does it take to train the model to a sufficient quality (answered in the next subsection), and (2) how robust is the optimizer to variations in its input (answered in Section 6.4).

6.3 Training Time

To analyze the convergence time, we measured the performance after every training iteration, for a total of 100 complete iterations. We first report the learning curves in training intervals to make the different systems comparable (e.g., a training episode with MS SQL Server might run much faster than with PostgreSQL). Afterwards, we report the wall-clock time to train the models on the different systems. Finally, we answer the question of how much our bootstrapping method helped with the training time.

6.3.1 Learning Curves

We measured the relative performance of Neo on our entire test suite with respect to the native optimizer (i.e., a performance of 1 is equivalent to the engine's optimizer) for every episode (a full pass over the set of training queries, i.e., retraining the network from the experience, choosing a plan for each training query, executing that plan, and adding the result to Neo's experience) of the 100 complete training episodes of the optimizer. We plot the median value as a solid line, and the minimum and maximum values using the shaded region. For all DBMSes except PostgreSQL, we additionally plot the relative performance of the plans generated by the PostgreSQL optimizer when executed on the target engine.

Convergence Each figure demonstrates a similar behavior: after the first training iteration, Neo's performance is poor (e.g., nearly 2.5 times worse than the native optimizer). Then, for several iterations, the performance of Neo sharply improves, until it levels off (converges). We analyze the convergence time specifically in Section 6.3.2. Here, we note that Neo is able to improve on the PostgreSQL optimizer in as few as 9 training iterations (i.e., the number of training iterations until the median run crosses the line representing PostgreSQL). It is not surprising that matching the performance of a commercial optimizer like MS SQL Server or Oracle requires significantly more training iterations than for SQLite, as commercial systems are much more sophisticated.

Variance The variance between the different training iterations is small for all workloads, except for the TPC-H dataset. We hypothesize that, with the uniform data distributions in TPC-H, the R-Vector embedding is not as useful, and thus it takes the model longer to adjust accordingly. This behavior is not present in the other two, non-synthetic datasets.

6.3.2 Wall-Clock Time

So far, we analyzed how long it took Neo to become competitive in terms of training iterations; next, we analyze the time it takes for Neo to become competitive in terms of wall-clock time (real time). We analyzed how long it took for Neo to learn a policy that was on-par with (1) the query execution plans produced by PostgreSQL, but executed on the target execution engine, and (2) the query plans produced by the native optimizer and executed on the same execution engine. Figure 11 shows the time (in minutes) that it took for Neo to reach these two milestones (the left and right bar charts represent milestone (1) and (2), respectively), split into time spent training the neural network and time spent executing queries. Note that the query execution step is parallelized, executing queries on different nodes simultaneously.

Unsurprisingly, it takes longer for Neo to become competitive with the more advanced, commercial optimizers. However, for every engine, learning a policy that outperforms the PostgreSQL optimizer consistently takes less than two hours. Furthermore, Neo was able to match or exceed the performance of every optimizer within half a day. Note that this time does not include the time for training the query encoding, which in the case of 1-Hot and Histogram is negligible. However, it takes longer for R-Vector (see Section 6.6).

6.3.3 Is Demonstration Even Necessary?

Since gathering demonstration data introduces additional complexity to the system, it is natural to ask if demonstration data is necessary at all. Is it possible to learn a good policy starting from zero knowledge? While previous work [34] showed that an off-the-shelf deep reinforcement learning technique can learn to find query plans that minimize a cost model without demonstration data, learning a policy based on query latency (i.e., end to end) poses additional difficulties: a bad plan can take hours to execute. Unfortunately, randomly chosen query plans behave exceptionally poorly. Leis et al. showed that randomly sampled join orderings can result in a 100x to 1000x increase in query execution times for JOB queries, compared to a reasonable plan [25], potentially increasing the training time of Neo by a similar factor [35].

We attempted to work around this problem by selecting an ad-hoc query timeout t (e.g., 5 minutes) and terminating query executions when their latencies exceed t. However, this technique destroys a good amount of the signal that Neo uses to learn: join patterns resulting in a latency of 7 minutes get the same reward as join patterns resulting in a latency of 1 week, and thus Neo cannot learn that the join patterns in the 7-minute plan are an improvement over the 1-week plan. As a result, even after training for over three weeks, we did not achieve the plan quality that we achieve when bootstrapping the system with the PostgreSQL optimizer.

6.4 Robustness

For all experiments thus far, Neo was always evaluated over the test dataset, never the training dataset. This clearly demonstrates that Neo does generalize to new queries. In this subsection, we study this further by also testing Neo's performance for the different featurization techniques, over entirely new queries (i.e., queries invented specifically to exhibit novel behavior), and by measuring the sensitivity of chosen query plans to cardinality estimation errors.

6.4.1 Featurization

Figure 12 shows the performance of Neo across all four DBMSes for the JOB dataset, varying the featurization used. Here, we examine both the regular R-Vector encoding and a variant of it built without any joins for denormalization (see Section 5). As expected, the 1-Hot encoding consistently performs the worst, as the 1-Hot encoding contains

[Figure 10: a 3 × 4 grid of learning-curve plots; rows are the JOB, TPC-H, and Corp workloads, columns are PostgreSQL, SQLite, MS SQL Server, and Oracle. Each plot shows normalized latency (y-axis, 0 to 2.5) against training iterations (x-axis, 0 to 100) for Neo (R-Vectors), Neo (Row Vectors), and, where applicable, the PostgreSQL plans executed on the target engine.]

Figure 10: Learning curves with variance. Shaded area spans minimum to maximum across fifty runs with different random seeds. For a plot with all four featurization techniques, please visit: http://rm.cab/l/lc.pdf
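Each training iteration on the x-axis above corresponds to one episode of the loop described earlier (retrain, plan, execute, record). A minimal sketch of that loop is below; the `retrain`, `plan`, and `execute` callables are hypothetical stand-ins for Neo's value-network training, plan search, and execution engine:

```python
import random

def run_episode(train_queries, experience, retrain, plan, execute):
    """One training episode, as described above: retrain the value network
    from accumulated experience, then plan and execute every training
    query, folding the observed result back into the experience set."""
    model = retrain(experience)          # learn from past executions
    for q in train_queries:
        p = plan(model, q)               # choose a plan for the query
        latency = execute(p)             # run the plan on the target engine
        experience.append((p, latency))  # build on successes and failures
    return model

# Toy stand-ins: planning is random and "latency" is synthetic.
queries = ["q1", "q2", "q3"]
experience = []
for episode in range(100):               # 100 complete episodes, as above
    run_episode(queries, experience,
                retrain=lambda exp: None,
                plan=lambda model, q: (q, random.randint(0, 9)),
                execute=lambda p: 0.1 * p[1])
print(len(experience))   # 300 executed plans recorded
```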

no information about predicate cardinality. The Histogram encoding, while making naive uniformity assumptions, provides enough information about predicate cardinality to improve Neo's performance. In each case, the R-Vector encoding variants produce the best overall performance, with the "no joins" variant lagging slightly behind. This is because the R-Vector encoding contains significantly more semantic information about the underlying database than the naive histograms (see Section 5). The improved performance of R-Vector compared to the other encoding techniques demonstrates the benefits of tailoring the feature representation to the underlying data.

[Figure 11: a bar chart with one bar per execution engine (PostgreSQL, SQLite, SQL Server, Oracle); each bar decomposes training time (in minutes, 0 to 500) into neural network time and query execution time, with markers for the time to match PostgreSQL and the time to match the native optimizer.]

Figure 11: Training time, in minutes, for Neo to match the performance of PostgreSQL and each native optimizer.

[Figure 12: a bar chart of Neo's relative performance (0.5 to 1.2) on each execution engine under the R-Vectors, R-Vectors (no joins), Histograms, and 1-Hot featurizations.]

Figure 12: Neo's performance using each featurization.

6.4.2 On Entirely New Queries
Previous experiments demonstrated Neo's ability to generalize to queries in a randomly-selected, held-out test set drawn from the same workload as the training set. While this shows that Neo can handle previously-unseen predicates and modifications to join graphs, it does not necessarily demonstrate that Neo will be able to generalize to a completely new query. To test Neo's behavior on new queries, we created a set of 24 additional queries (available at https://git.io/extended_job), which we call Ext-JOB, that are semantically distinct from the original JOB workload (no shared predicates or join graphs).
After Neo had trained for 100 episodes on the JOB queries, we evaluated the relative performance of Neo on the Ext-JOB queries. Figure 13 shows the results: the full height of each bar represents the performance of Neo on the unseen queries relative to every other system. First, we note that with the R-Vector featurization, the execution plans chosen for the

entirely-unseen queries in the Ext-JOB dataset still outperformed or matched the native optimizer. Second, the larger gap between the R-Vector featurizations and the Histogram and 1-Hot featurizations demonstrates that row vectors are an effective way of capturing information about query predicates that generalizes to entirely new queries.

Learning new queries Since Neo is able to progressively learn from each query execution, we also evaluated the performance of Neo on the Ext-JOB queries after just 5 additional training episodes. The solid bars in Figure 13 show the results. Once Neo has seen each new query a handful of times, Neo's performance increases quickly, having learned how to handle the new complexities introduced by the previously-unseen queries. Thus, while the performance of Neo initially degrades when confronted with new queries, Neo quickly adapts its policy to suit them. This showcases the potential for a deep-learning powered query optimizer to keep up with changes in real-world query workloads.

[Figure 13: a bar chart of Neo's relative performance (0.5 to 1.3) on each execution engine for the R-Vectors, R-Vectors (no joins), Histograms, and 1-Hot featurizations, evaluated on the Ext-JOB queries.]

Figure 13: Neo's performance on entirely new queries (Ext-JOB), full bar height. Neo's performance after 5 iterations with Ext-JOB queries, solid bar height.

6.4.3 Cardinality Estimates
The strong relationship between cardinality estimation and query optimization is well-studied [4, 39]. However, query optimizers must take into account that most cardinality estimation methods tend to become significantly less accurate as the number of joins increases [25]. While deep neural networks are generally regarded as black boxes, here we show that Neo is capable of learning when to trust cardinality estimates and when to ignore them.
To measure the robustness of Neo to cardinality estimation errors, we trained two Neo models with an additional feature at each tree node. The first model received the PostgreSQL optimizer's cardinality estimate, and the second model received the true cardinality. We then plotted a histogram of both models' outputs across the JOB workload when the number of joins was ≤ 3 and > 3, introducing artificial error to the additional features.
For example, Figure 14a shows the histogram of value network predictions for all states with at most 3 joins. When the error is increased from zero orders of magnitude to two and five orders of magnitude, the variance of the distribution increases: in other words, when the number of joins is at most 3, Neo learns a model that varies with the PostgreSQL cardinality estimate. However, in Figure 14b, we see that the distribution of network outputs hardly changes at all when the number of joins is greater than 3: in other words, when the number of joins is greater than 3, Neo learns to ignore the PostgreSQL cardinality estimates altogether. On the other hand, Figures 14c and 14d show that when Neo's value model is trained with true cardinalities as inputs, Neo learns a model that varies its prediction with the cardinality regardless of the number of joins. In other words, when provided with true cardinalities, Neo learns to rely on the cardinality information irrespective of the number of joins. Thus, we conclude that Neo is able to learn which input features are reliable, even when the reliability of those features varies with the number of joins.

6.4.4 Per Query Performance
Finally, we analyzed the per-query performance of Neo (as opposed to the workload performance). The absolute performance improvement (or regression) in seconds for each query of the JOB workload between the Neo and PostgreSQL plans is shown in Figure 15, in purple. As can be seen, Neo is able to significantly improve the execution time of many queries by up to 40 seconds, but it also worsens the execution time of a few queries, e.g., query 24a becomes 8.5 seconds slower.
However, in contrast to a traditional optimizer, in Neo we can easily change the optimization goal. So far, we always aimed to optimize the total workload cost, i.e., the total latency across all queries. However, we can also change the optimization goal to optimize for the relative improvement per query (green bars in Figure 15). This implicitly penalizes changes in query performance from the baseline (e.g., PostgreSQL). When trained with this cost function, the total workload time is still accelerated (by 289 seconds, as opposed to nearly 500 seconds), and all but one query (query 29b, which regresses by 43 milliseconds) sees improved performance over the PostgreSQL baseline. Thus, we conclude that Neo responds to different optimization goals, allowing it to be customized for different scenarios and user needs.
It is possible that Neo's loss function could be further customized to weigh queries differently depending on their importance to the user, i.e., query priority. It may also be possible to build an optimizer that is directly aware of service-level agreements (SLAs). We leave such investigations to future work.

6.5 Search
Neo uses the trained value network to search for query plans until a fixed-time cutoff (see Section 4.2). Figure 16 shows how the performance of a query with a particular number of joins (selected randomly from the JOB dataset, executed on PostgreSQL) varies as the search time is changed (previous experiments used a fixed cutoff of 250ms). Note that the x-axis skips some values, e.g., the JOB dataset has no queries with 13 joins. Here, query performance is given relative to the best observed performance. For example, when the number of joins is 10, Neo found the best-observed plan whenever the cutoff time was greater than 120ms. We also tested significantly extending the search time (to 5 minutes), and found that such an extension did not change query performance regardless of the number of joins in the query (up to 17 in the JOB dataset).
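A time-bounded, value-guided plan search of this kind can be sketched as follows; the toy plan space and the `value` function below are hypothetical, standing in for Neo's plan enumeration and learned value network:

```python
import heapq
import itertools
import time

def best_first_search(initial, children, value, cutoff_ms=250):
    """Time-bounded best-first plan search: repeatedly expand the partial
    plan the value function scores cheapest, until the wall-clock cutoff,
    and return the best complete plan found."""
    tie = itertools.count()              # tie-breaker so heapq never compares plans
    frontier = [(value(initial), next(tie), initial)]
    best = None
    deadline = time.monotonic() + cutoff_ms / 1000
    while frontier and time.monotonic() < deadline:
        cost, _, plan = heapq.heappop(frontier)
        kids = children(plan)
        if not kids:                     # a complete plan: every join is specified
            if best is None or cost < best[0]:
                best = (cost, plan)
            continue
        for k in kids:
            heapq.heappush(frontier, (value(k), next(tie), k))
    return best

# Hypothetical toy plan space: a plan is a tuple of 3 binary join choices,
# and the "value network" pretends plans with fewer 1s are cheaper.
value = lambda p: sum(p) + 0.1 * len(p)
children = lambda p: [] if len(p) == 3 else [p + (0,), p + (1,)]
cost, plan = best_first_search((), children, value)
print(plan)   # (0, 0, 0)
```

With a larger cutoff the search simply explores more of the frontier before committing, which matches the behavior in Figure 16: queries with more joins have larger search spaces and need more time.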

[Figure 14: four histograms of value network output (x-axis, −2 to 2) against normalized frequency (y-axis, 0 to 2), each overlaying injected errors of 0, 2, and 5 orders of magnitude: (a) PostgreSQL, ≤ 3 joins; (b) PostgreSQL, > 3 joins; (c) true cardinality, ≤ 3 joins; (d) true cardinality, > 3 joins.]

Figure 14: Robustness to cardinality estimation errors
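The error injection behind Figure 14 can be sketched as follows; the text above states only the error magnitude in orders of magnitude, so the uniform-exponent noise model here is our assumption:

```python
import random

def perturb(cardinality, orders_of_magnitude, rng=random):
    """Inject artificial error into a cardinality feature by scaling it by
    up to +/- the given number of orders of magnitude (an assumed noise
    model; with 0 orders of magnitude the feature is exact)."""
    exponent = rng.uniform(-orders_of_magnitude, orders_of_magnitude)
    return cardinality * 10 ** exponent

rng = random.Random(42)
true_card = 1_000
for k in (0, 2, 5):
    # With k = 5 the feature can be off by a factor of up to 100,000.
    print(f"error = {k} orders of magnitude: {perturb(true_card, k, rng):,.1f}")
```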

[Figure 15: per-query difference from PostgreSQL in seconds (y-axis, −50 to 20; lower is better) for every query of the JOB workload, under the workload-cost objective (purple) and the relative-cost objective (green).]

Figure 15: Workload cost vs. relative cost for JOB queries between Neo and PostgreSQL (lower is better)
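The two optimization goals compared in Figure 15 can be written down as simple cost functions; the per-query ratio used for the relative goal is one plausible formulation, not necessarily the exact one Neo uses:

```python
def workload_cost(neo_latencies, baseline_latencies):
    """Total-latency goal: the summed latency of the whole workload."""
    return sum(neo_latencies)

def relative_cost(neo_latencies, baseline_latencies):
    """Relative goal: per-query latency relative to the baseline plan,
    which implicitly penalizes regressions from (e.g.) PostgreSQL."""
    return sum(n / b for n, b in zip(neo_latencies, baseline_latencies))

neo  = [1.0, 2.0, 30.0]   # hypothetical Neo latencies (seconds)
base = [4.0, 4.0, 10.0]   # hypothetical PostgreSQL baseline latencies
print(workload_cost(neo, base))   # 33.0 - dominated by the one slow query
print(relative_cost(neo, base))   # 3.75 - the 3x regression on query 3 dominates
```

Under the first goal, a single large regression is acceptable if other queries improve enough in absolute terms; under the second, every regression is costly, matching the "all but one query improves" result above.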

[Figure 16: search time in milliseconds (y-axis, 50 to 250) against the number of joins (x-axis: 4–12, 14, 17), with query performance relative to the best observed (secondary scale, 1 to 4.5).]

Figure 16: Search time vs. performance, grouped by number of joins

[Figure 17: time to build row vectors in minutes (y-axis, log scale, 1 to 10000) for the "joins" and "no joins" variants on each dataset (x-axis: JOB, TPC-H, Corp).]

Figure 17: Row vector training time
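The training times above "pay for themselves" through faster query processing; the arithmetic (using the numbers reported in Section 6.6) follows directly from the training time, the speedup, and the number of concurrently executing queries:

```python
def payback(training_hours, speedup, concurrency):
    """Hours of query processing needed before row-vector training time is
    recouped, plus the wall-clock equivalent when `concurrency` queries
    run simultaneously."""
    processing_hours = training_hours / speedup
    return processing_hours, processing_hours / concurrency

# "joins" variant on Corp: 27 h of training, 5% faster queries, 8 at a time.
processing, wall_clock = payback(27, 0.05, 8)
print(processing, wall_clock)   # 540 query-processing hours; 67.5 h wall clock
# "no joins" variant: 217 minutes of training, 3% faster queries.
print(payback(217 / 60, 0.03, 8)[1])   # roughly 15 hours wall clock
```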

The relationship between the number of joins and sensitivity to search time is unsurprising: queries with more joins have a larger search space, and thus require more time to optimize. While 250ms to optimize a query with 17 joins is acceptable in many scenarios, other options [59] may be more desirable when this is not the case.

6.6 Row vector training time
Here, we analyze the time it takes to build the R-Vector representation. Our implementation uses the open-source gensim package [48], with no additional optimizations. Figure 17 shows the time it takes to train row vectors on each dataset, for both the "joins" (partially denormalized) and "no joins" (normalized) variants, as described in Section 5. The time to train a row embedding model is proportional to the size of the dataset. For the JOB dataset (approximately 4GB), the "no joins" variant trains in less than 10 minutes, whereas the "no joins" variant for the Corp dataset (approximately 2TB) requires nearly two hours to train. The "joins" (partially denormalized) variant takes significantly longer to train, e.g., three hours (JOB) to a full day (27 hours, Corp).
Building either variant of row vectors may be prohibitive in some cases. However, experimentally we found that, compared to Histogram, the "joins" variant resulted on average in 5% faster query processing times, and the "no joins" variant in 3% faster query processing times (e.g., Figure 9). Depending on the multiprocessing level, query arrival rate, etc., row vectors may "pay for themselves" very quickly: for example, the training time for building the "joins" variant on the Corp dataset is "paid for" after 540 hours of query processing, since the row vectors speed up query processing by 5% and require 27 hours to train. As the corporation constantly executes 8 queries simultaneously, this amounts to just three days. The "no joins" variant (which improves performance by 3% and takes 217 minutes to train) is "paid for" after just 15 hours.
We do not analyze the behavior of row vectors on a changing database. It is possible that, depending on the database, row vectors quickly become "stale" (the data distribution shifts quickly), or remain relevant for long periods of time (the data distribution shifts slowly). New techniques [11, 62]

suggest that retraining word vector models when the underlying data has changed can be done quickly, but we leave investigating these methods to future work.

7. RELATED WORK
The relational query optimization problem has been around for more than forty years and is one of the most studied problems in database management systems [8, 52]. Yet, query optimization is still an unsolved problem [29], especially due to the difficulty of accurately estimating cardinalities [25, 26]. IBM DB2's LEO optimizer was the first to introduce the idea of a query optimizer that learns from its mistakes [54]. In follow-up work, CORDS proactively discovered correlations between any two columns using data samples in advance of query execution [17].
Recent progress in machine learning has led to new ideas for learning-based approaches, especially deep learning [60], to optimizing query run time. For example, recent work [18, 57] showed how to exploit reinforcement learning for Eddies-style, fine-grained adaptive query processing. More recently, Trummer et al. proposed the SkinnerDB system, based on the idea of using regret bounds as a quality measure while using reinforcement learning to dynamically improve the execution of an individual query in an adaptive query processing system [56]. Ortiz et al. analyzed how state representations affect query optimization when using reinforcement learning [43]. QuickSel offered query-driven mixture models as an alternative to histograms and samples for adaptive selectivity learning [44]. Kipf et al. and Liu et al. proposed deep learning approaches to cardinality estimation, specifically designed to capture join-crossing correlations and 0-tuple situations (i.e., empty base table samples) [20, 27]. The closest work to ours is DQ [22], which proposed a learning-based approach exclusively for join ordering, and only for a given cost model. The key contribution of our paper over all of these previous approaches is that it provides an end-to-end, continuously learning solution to the database query optimization problem. Our solution does not rely on any hand-crafted cost model or data distribution assumptions.
This paper builds on recent progress from our own team. ReJOIN [34] proposed a deep reinforcement learning approach for join order enumeration, which was generalized into a broader vision for designing an end-to-end learning-based query optimizer in [35]. Decima [32] proposed a reinforcement learning-based scheduler, which processes query plans via a graph neural network to learn workload-specific scheduling policies that minimize query latency. SageDB [21] laid out a vision towards building a new type of data processing system which will replace every component of a database system, including the query optimizer, with learned components, thereby gaining the capability to best specialize itself for every use case. This paper is one of the first steps toward realizing this overall vision.

8. CONCLUSIONS
This paper presents Neo, the first end-to-end learning optimizer that generates highly efficient query execution plans using deep neural networks. Neo iteratively improves its performance through a combination of reinforcement learning and a search strategy. On four database systems and three query datasets, Neo consistently outperforms or matches existing commercial query optimizers (e.g., Oracle's and Microsoft's) which have been tuned over decades.
In the future, we plan to investigate various methods for generalizing a learned model to unseen schemas (using, e.g., transfer learning [6]). We also intend to further optimize our row vector encoding technique. Finally, we are interested in measuring the performance of Neo when bootstrapped from both more primitive and more advanced commercial optimizers. Using a commercial database system as an initial expert might provide substantially faster convergence, or even a better final result (although this would introduce a dependency on a complex, hand-engineered optimizer, defeating a major benefit of Neo). Alternatively, using a simple, Selinger-style [52] optimizer may prove effective, alleviating the need for even the complexities of the PostgreSQL optimizer.

9. REFERENCES
[1] PostgreSQL database. http://www.postgresql.org/.
[2] T. Anthony, Z. Tian, and D. Barber. Thinking Fast and Slow with Deep Learning and Tree Search. In Advances in Neural Information Processing Systems 30, NIPS '17, pages 5366–5376, 2017.
[3] J. L. Ba, J. R. Kiros, and G. E. Hinton. Layer Normalization. arXiv:1607.06450 [cs, stat], July 2016.
[4] B. Babcock and S. Chaudhuri. Towards a Robust Query Optimizer: A Principled and Practical Approach. In SIGMOD '05, pages 119–130, 2005.
[5] R. Bellman. A Markovian Decision Process. Indiana University Mathematics Journal, 6(4):679–684, 1957.
[6] Y. Bengio. Deep Learning of Representations for Unsupervised and Transfer Learning. In ICML Workshop on Unsupervised and Transfer Learning, ICML WUTL '12, pages 17–36, June 2012.
[7] R. Bordawekar and O. Shmueli. Using Word Embedding to Enable Semantic Queries in Relational Databases. In 1st Workshop on Data Management for End-to-End Machine Learning, DEEM '17, pages 5:1–5:4, 2017.
[8] S. Chaudhuri. An Overview of Query Optimization in Relational Systems. In ACM SIGMOD Symposium on Principles of Database Systems, SIGMOD '98, pages 34–43, 1998.
[9] G. V. de la Cruz Jr, Y. Du, and M. E. Taylor. Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning. arXiv:1709.04083 [cs], Sept. 2017.
[10] R. Dechter and J. Pearl. Generalized Best-first Search Strategies and the Optimality of A*. J. ACM, 32(3):505–536, July 1985.
[11] M. Faruqui, J. Dodge, S. K. Jauhar, C. Dyer, E. H. Hovy, and N. A. Smith. Retrofitting Word Vectors to Semantic Lexicons. In NAACL '15, pages 1606–1615, 2015.
[12] R. C. Fernandez and S. Madden. Termite: A System for Tunneling Through Heterogeneous Data. Preprint, 2019.
[13] L. Giakoumakis and C. A. Galindo-Legaria. Testing SQL Server's Query Optimizer: Challenges, Techniques and Experiences. IEEE Data Eng. Bull., 31:36–43, 2008.
[14] X. Glorot, A. Bordes, and Y. Bengio. Deep Sparse Rectifier Neural Networks. In AISTATS '11, PMLR volume 15, pages 315–323, Apr. 2011.
[15] G. Graefe and W. J. McKenna. The Volcano Optimizer Generator: Extensibility and Efficient Search. In ICDE '93, pages 209–218, 1993.
[16] T. Hester, M. Vecerik, O. Pietquin, M. Lanctot, T. Schaul, B. Piot, D. Horgan, J. Quan, A. Sendonaris, G. Dulac-Arnold, I. Osband, J. Agapiou, J. Z. Leibo, and A. Gruslys. Deep Q-learning from Demonstrations. In Thirty-Second AAAI Conference on Artificial Intelligence, AAAI '18, 2018.
[17] I. F. Ilyas, V. Markl, P. Haas, P. Brown, and A. Aboulnaga. CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies. In SIGMOD '04, pages 647–658, 2004.
[18] T. Kaftan, M. Balazinska, A. Cheung, and J. Gehrke. Cuttlefish: A Lightweight Primitive for Adaptive Query Processing. arXiv preprint, Feb. 2018.
[19] D. P. Kingma and J. Ba. Adam: A Method for Stochastic Optimization. In ICLR '15, 2015.
[20] A. Kipf, T. Kipf, B. Radke, V. Leis, P. Boncz, and A. Kemper. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR '19, 2019.
[21] T. Kraska, M. Alizadeh, A. Beutel, E. Chi, A. Kristo, G. Leclerc, S. Madden, H. Mao, and V. Nathan. SageDB: A Learned Database System. In CIDR '19, 2019.
[22] S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein, and I. Stoica. Learning to Optimize Join Queries With Deep Reinforcement Learning. arXiv:1808.03196 [cs], Aug. 2018.
[23] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS '12, pages 1097–1105, 2012.
[24] Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, May 2015.
[25] V. Leis, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, and T. Neumann. How Good Are Query Optimizers, Really? Proc. VLDB Endow., 9(3):204–215, Nov. 2015.
[26] V. Leis, B. Radke, A. Gubichev, A. Mirchev, P. Boncz, A. Kemper, and T. Neumann. Query optimization through the looking glass, and what we found running the Join Order Benchmark. The VLDB Journal, pages 1–26, Sept. 2017.
[27] H. Liu, M. Xu, Z. Yu, V. Corvinelli, and C. Zuzarte. Cardinality Estimation Using Neural Networks. In CASCON '15, pages 53–59, 2015.
[28] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi. A survey of deep neural network architectures and their applications. Neurocomputing, 234:11–26, Apr. 2017.
[29] G. Lohman. Is Query Optimization a "Solved" Problem? ACM SIGMOD Blog, 2014.
[30] J. Long, E. Shelhamer, and T. Darrell. Fully Convolutional Networks for Semantic Segmentation. In CVPR '15, June 2015.
[31] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts. Learning word vectors for sentiment analysis. In ACL-HLT '11, pages 142–150, June 2011.
[32] H. Mao, M. Schwarzkopf, S. B. Venkatakrishnan, Z. Meng, and M. Alizadeh. Learning scheduling algorithms for data processing clusters. arXiv:1810.01963, 2018.
[33] G. Marcus. Innateness, AlphaZero, and Artificial Intelligence. arXiv:1801.05667 [cs], Jan. 2018.
[34] R. Marcus and O. Papaemmanouil. Deep Reinforcement Learning for Join Order Enumeration. In aiDM '18, June 2018.
[35] R. Marcus and O. Papaemmanouil. Towards a Hands-Free Query Optimizer through Deep Learning. In CIDR '19, 2019.
[36] T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs], Jan. 2013.
[37] T. Mikolov, W.-t. Yih, and G. Zweig. Linguistic Regularities in Continuous Space Word Representations. In HLT '13, 2013.
[38] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, and G. Ostrovski. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
[39] G. Moerkotte, T. Neumann, and G. Steidl. Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors. Proc. VLDB Endow., 2(1):982–993, Aug. 2009.
[40] L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin. Convolutional Neural Networks over Tree Structures for Programming Language Processing. In AAAI '16, pages 1287–1293, 2016.
[41] S. Mudgal, H. Li, T. Rekatsinas, A. Doan, Y. Park, G. Krishnan, R. Deep, E. Arcaute, and V. Raghavendra. Deep Learning for Entity Matching: A Design Space Exploration. In SIGMOD '18, pages 19–34, 2018.
[42] B. Nevarez. Inside the SQL Server Query Optimizer. Red Gate Books, Mar. 2011.
[43] J. Ortiz, M. Balazinska, J. Gehrke, and S. S. Keerthi. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In 2nd Workshop on Data Management for End-to-End Machine Learning, DEEM '18, 2018.
[44] Y. Park, S. Zhong, and B. Mozafari. QuickSel: Quick Selectivity Learning with Mixture Models. arXiv:1812.10568 [cs], Dec. 2018.
[45] M. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer. Deep Contextualized Word Representations. In NAACL '18, pages 2227–2237, 2018.
[46] M. Poess and C. Floyd. New TPC Benchmarks for Decision Support and Web Commerce. SIGMOD Record, 29(4):64–71, Dec. 2000.
[47] A. G. Read. DeWitt clauses: Can we protect purchasers without hurting Microsoft. Rev. Litig., 25:387, 2006.
[48] R. Řehůřek and P. Sojka. Software Framework for Topic Modelling with Large Corpora. In LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45–50, May 2010.
[49] S. Schaal. Learning from Demonstration. In NIPS '96, pages 1040–1046, 1996.
[50] M. Schaarschmidt, A. Kuhnle, B. Ellis, K. Fricke, F. Gessert, and E. Yoneki. LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations. arXiv:1808.07903 [cs, stat], Aug. 2018.
[51] J. Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, Jan. 2015.
[52] P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access Path Selection in a Relational Database Management System. In J. Mylopoulos and M. Brodie, editors, SIGMOD '89, pages 511–522, 1989. Morgan Kaufmann.
[53] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, Jan. 2016.
[54] M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's LEarning Optimizer. In VLDB '01, pages 19–28, 2001.
[55] N. Tran, A. Lamb, L. Shrinivas, S. Bodagala, and J. Dave. The Vertica Query Optimizer: The case for specialized query optimizers. In ICDE '14, pages 1108–1119, Mar. 2014.
[56] I. Trummer, S. Moseley, D. Maram, S. Jo, and J. Antonakakis. SkinnerDB: Regret-bounded Query Evaluation via Reinforcement Learning. Proc. VLDB Endow., 11(12):2074–2077, Aug. 2018.
[57] K. Tzoumas, T. Sellis, and C. Jensen. A Reinforcement Learning Approach for Adaptive Query Processing. Technical report, June 2008.
[58] L. van der Maaten and G. Hinton. Visualizing Data using t-SNE. Journal of Machine Learning Research, 9:2579–2605, 2008.
[59] F. Waas and A. Pellenkoft. Join Order Selection (Good Enough Is Easy). In Advances in Databases, pages 51–67. Springer, July 2000.
[60] W. Wang, M. Zhang, G. Chen, H. V. Jagadish, B. C. Ooi, and K.-L. Tan. Database Meets Deep Learning: Challenges and Opportunities. SIGMOD Rec., 45(2):17–22, Sept. 2016.
[61] C. J. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3-4):279–292, 1992.
[62] L. Yu, J. Wang, K. R. Lai, and X. Zhang. Refining Word Embeddings Using Intensity Scores for Sentiment Analysis. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26(3):671–681, Mar. 2018.
[63] J.-Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, and E. Shechtman. Toward Multimodal Image-to-Image Translation. In NIPS '17, pages 465–476, 2017.
[64] S. Zilberstein. Using Anytime Algorithms in Intelligent Systems. AI Magazine, 17(3):73–73, Mar. 1996.

APPENDIX
A. NEURAL NETWORK MODEL
In this appendix, we present a formal specification of Neo's neural network model (the value network). An intuitive description is provided in Section 4.

Let the query-level information vector for an execution plan P for a query Q(P) be V(Q(P)). Let F(P) be the set of root nodes in the forest P. We define L(x) and R(x) as the left and right children of a node x, respectively, and let V(x) be the vectorized representation of the tree node x. We denote each r ∈ F(P) as the tuple (r, L(r), R(r)).
The query-level information V(Q(P)) is initially passed through a set of fully connected layers (see Figure 5) of monotonically decreasing size. After the final fully connected layer, the resulting vector g is combined with each tree node to form an augmented forest F′(P). Intuitively, this augmented forest is created by appending g to each tree node. Formally, we define A(r) as the augmenting function for the root of a tree:

A(r) = (V(r) ⌢ g, A(L(r)), A(R(r)))

where ⌢ is the vector concatenation operator. Then:

F′(P) = {A(r) | r ∈ F(P)}

We refer to each entry of an augmented tree node's vector as a channel. Next, we define tree convolution, an operation that maps a tree with c_in channels to a tree with c_out channels. The augmented forest is passed through a number of tree convolution layers. Details about tree convolution can be found in [40]; here, we provide a mathematical specification. Let a filterbank be a matrix of size 3 × c_in × c_out. We then define the convolution of the root node r of a tree having c_in channels with a filterbank f, resulting in a structurally isomorphic tree with c_out channels:

(r ∗ f) = ((V(r) ⌢ V(L(r)) ⌢ V(R(r))) × f, L(r) ∗ f, R(r) ∗ f)

We define the convolution of a forest of trees with a filterbank as the convolution of each tree in the forest with that filterbank. The output of the three consecutive tree convolution layers in the value network, with filterbanks f1, f2, and f3, can thus be denoted as:

T = ((F′(P) ∗ f1) ∗ f2) ∗ f3

Let final_out be the number of channels in T, the output of the consecutive tree convolution layers. Next, we apply a dynamic pooling layer [40]. This layer takes the element-wise maximum of each channel, flattening the forest into a single vector W of size final_out. Dynamic pooling can be thought of as stacking each tree node's vectorized representation into a tall matrix of size n × final_out, where n is the total number of tree nodes, and then taking the maximum value in each matrix column.
Once produced, W is passed through a final set of fully connected layers, until the final layer of the network produces a single output, which is used to predict the value of a particular state.
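The tree convolution and dynamic pooling operations defined above can be sketched in a few lines of plain Python. The channel counts and the filterbank below are toy values chosen by hand, not the learned network; missing children are treated as zero vectors:

```python
# A tree node is (vector, left, right); a filterbank is a nested list of
# shape (3 * c_in) x c_out, matching the 3 x c_in x c_out matrix above.

def conv_node(node, f, c_in):
    """(r * f): multiply the concatenated window [V(r), V(L(r)), V(R(r))]
    by the filterbank, then recurse into both children."""
    if node is None:
        return None
    vec, left, right = node
    zeros = [0.0] * c_in
    window = vec + (left[0] if left else zeros) + (right[0] if right else zeros)
    c_out = len(f[0])
    out = [sum(window[i] * f[i][j] for i in range(3 * c_in)) for j in range(c_out)]
    return (out, conv_node(left, f, c_in), conv_node(right, f, c_in))

def dynamic_pool(node, acc=None):
    """Element-wise maximum of each channel over every node in the tree."""
    if node is None:
        return acc
    vec, left, right = node
    acc = vec if acc is None else [max(a, b) for a, b in zip(acc, vec)]
    return dynamic_pool(right, dynamic_pool(left, acc))

# Toy example with c_in = 1, c_out = 2: channel 0 sums the window, and
# channel 1 copies the node's own value.
f = [[1.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
tree = ([2.0], ([1.0], None, None), ([3.0], None, None))
convolved = conv_node(tree, f, c_in=1)
print(convolved[0])            # root output: [2+1+3, 2] = [6.0, 2.0]
print(dynamic_pool(convolved)) # per-channel max over all three nodes
```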
