Automatic Database Management System Tuning Through Large-Scale Machine Learning
Dana Van Aken (Carnegie Mellon University), Andrew Pavlo (Carnegie Mellon University), Geoffrey J. Gordon (Carnegie Mellon University), Bohan Zhang (Peking University)
[email protected] [email protected] [email protected] [email protected]

ABSTRACT

Database management system (DBMS) configuration tuning is an essential aspect of any data-intensive application effort. But this is historically a difficult task because DBMSs have hundreds of configuration "knobs" that control everything in the system, such as the amount of memory to use for caches and how often data is written to storage. The problem with these knobs is that they are not standardized (i.e., two DBMSs use a different name for the same knob), not independent (i.e., changing one knob can impact others), and not universal (i.e., what works for one application may be sub-optimal for another). Worse, information about the effects of the knobs typically comes only from (expensive) experience.

To overcome these challenges, we present an automated approach that leverages past experience and collects new information to tune DBMS configurations: we use a combination of supervised and unsupervised machine learning methods to (1) select the most impactful knobs, (2) map unseen database workloads to previous workloads from which we can transfer experience, and (3) recommend knob settings. We implemented our techniques in a new tool called OtterTune and tested it on three DBMSs. Our evaluation shows that OtterTune recommends configurations that are as good as or better than ones generated by existing tools or a human expert.

1. INTRODUCTION

The ability to collect, process, and analyze large amounts of data is paramount for being able to extrapolate new knowledge in business and scientific domains [35, 25]. DBMSs are the critical component of data-intensive ("Big Data") applications [46]. The performance of these systems is often measured in metrics such as throughput (e.g., how fast it can collect new data) and latency (e.g., how fast it can respond to a request).

Achieving good performance in DBMSs is non-trivial as they are complex systems with many tunable options that control nearly all aspects of their runtime operation [24]. Such configuration knobs allow the database administrator (DBA) to control various aspects of the DBMS's runtime behavior. For example, they can set how much memory the system allocates for data caching versus the transaction log buffer. Modern DBMSs are notorious for having many configuration knobs [22, 47, 36]. Part of what makes DBMSs so enigmatic is that their performance and scalability are highly dependent on their configurations. Further exacerbating this problem is that the default configurations of these knobs are notoriously bad. As an example, the default MySQL configuration in 2016 assumes that it is deployed on a machine that only has 160 MB of RAM [1].

Given this, many organizations resort to hiring expensive experts to configure the system's knobs for the expected workload. But as databases and applications grow in both size and complexity, optimizing a DBMS to meet the needs of an application has surpassed the abilities of humans [11]. This is because the correct configuration of a DBMS is highly dependent on a number of factors that are beyond what humans can reason about.

Previous attempts at automatic DBMS configuration tools have certain deficiencies that make them inadequate for general purpose database applications. Many of these tuning tools were created by vendors, and thus they only support that particular company's DBMS [22, 33, 37]. The small number of tuning tools that do support multiple DBMSs still require manual steps, such as having the DBA (1) deploy a second copy of the database [24], (2) map dependencies between knobs [49], or (3) guide the training process [58]. All of these tools also examine each DBMS deployment independently and thus are unable to apply knowledge gained from previous tuning efforts. This is inefficient because each tuning effort can take a long time and use a lot of resources.

In this paper, we present a technique to reuse training data gathered from previous sessions to tune new DBMS deployments. The crux of our approach is to train machine learning (ML) models from measurements collected from these previous tunings, and use the models to (1) select the most important knobs, (2) map previously unseen database workloads to known workloads, so that we can transfer previous experience, and (3) recommend knob settings that improve a target objective (e.g., latency, throughput). Reusing past experience reduces the amount of time and resources it takes to tune a DBMS for a new application. To evaluate our work, we implemented our techniques using Google TensorFlow [50] and Python's scikit-learn [39] in a tuning tool, called OtterTune, and performed experiments for two OLTP DBMSs (MySQL, Postgres) and one OLAP DBMS (Vector). Our results show that OtterTune produces a DBMS configuration for these workloads that achieves 58–94% lower latency compared to their default settings or configurations generated by other tuning advisors. We also show that OtterTune generates configurations in under 60 min that are within 94% of ones created by expert DBAs.
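As a rough illustration of step (1), selecting the most impactful knobs from past measurements, the sketch below scores synthetic knob data by its correlation with a measured latency. Everything here is invented for illustration (the knob count, the data, and the scoring rule); it is not the paper's actual feature-selection method, only a stand-in for the idea that knob importance can be estimated from collected tuning data rather than manual experimentation.

```python
import math
import random

random.seed(0)

# Synthetic tuning history (all values invented for illustration):
# each observation pairs 4 knob settings with a measured latency.
n = 200
knobs = [[random.random() for _ in range(4)] for _ in range(n)]
latency = [3.0 * k[0] + 0.5 * k[2] + random.gauss(0, 0.05) for k in knobs]

def pearson(xs, ys):
    """Pearson correlation coefficient between two samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

# Score each knob by how strongly it correlates with the target metric,
# then rank the knobs from most to least impactful.
scores = [abs(pearson([row[i] for row in knobs], latency)) for i in range(4)]
ranking = sorted(range(4), key=lambda i: -scores[i])
print(ranking)   # knob 0 (the largest simulated effect) should rank first
```

At scale, a regularized regression over hundreds of knobs would play the same role as this correlation score; the point is only that a ranking of impactful knobs can be derived automatically from past tuning data.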
SIGMOD'17, May 14–19, 2017, Chicago, IL, USA. © 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-4197-4/17/05. $15.00. DOI: http://dx.doi.org/10.1145/3035918.3064029

The remainder of this paper is organized as follows. Sect. 2 begins with a discussion of the challenges in database tuning. We then provide an overview of our approach in Sect. 3, followed by a description of our techniques for collecting DBMS metrics in Sect. 4, identifying the knobs that have the most impact in Sect. 5, and recommending settings in Sect. 6. In Sect. 7, we present our experimental evaluation. Lastly, we conclude with related work in Sect. 8.

[Figure 1: Motivating Examples – Figs. 1a to 1c show performance measurements for the YCSB workload running on MySQL (v5.6) using different configuration settings. Fig. 1d shows the number of tunable knobs provided in MySQL and Postgres releases over time. Panels: (a) Dependencies; (b) Continuous Settings; (c) Non-Reusable Configurations; (d) Tuning Complexity.]

2. MOTIVATION

There are general rules or "best practice" guidelines available for tuning DBMSs, but these do not always provide good results for a range of applications and hardware configurations. Although one can rely on certain precepts to achieve good performance on a particular DBMS, they are not universal for all applications. Thus, many organizations resort to hiring expensive experts to tune their system. For example, a 2013 survey found that 40% of engagement requests for a large Postgres service company were for DBMS tuning and knob configuration issues [36].

One common approach to tuning a DBMS is for the DBA to copy the database to another machine and manually measure the performance of a sample workload from the real application. Based on the outcome of this test, they will then tweak the DBMS's configuration according to some combination of tuning guidelines and intuition based on past experiences.

…differences in performance from one setting to the next could be irregular. For example, the size of the DBMS's buffer pool can be an arbitrary value from zero to the amount of DRAM on the system. In some ranges, a 0.1 GB increase in this knob could be inconsequential, while in other ranges, a 0.1 GB increase could cause performance to drop precipitously as the DBMS runs out of physical memory. To illustrate this point, we ran another experiment where we increase MySQL's buffer pool size from 10 MB to 3 GB. The results in Fig. 1b show that the latency improves continuously up until 1.5 GB, after which the performance degrades because the DBMS runs out of physical memory.

Non-Reusable Configurations: The effort that a DBA spends on tuning one DBMS does not make tuning the next one any easier. This is because the best configuration for one application may not be the best for another. In this experiment, we execute three YCSB workloads using three MySQL knob configurations. Each configuration is designed to provide the best latency for one of the workloads (i.e., config #1 is the best for workload #1, same for #2 and #3). Fig. 1c shows that the best configuration for each workload
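This non-reusability is what motivates the workload-mapping step from Sect. 1: instead of tuning a new deployment from scratch, match it to the most similar previously tuned workload and reuse what was learned there. A minimal nearest-neighbor sketch of that idea follows; the metric names, vectors, and knob values below are invented for illustration.

```python
import math

# Hypothetical summaries of past tuning sessions (values invented):
# each previously seen workload is described by a vector of DBMS metrics.
past_metrics = {
    "workload_1": [0.80, 0.10, 0.30],
    "workload_2": [0.20, 0.90, 0.50],
    "workload_3": [0.40, 0.40, 0.95],
}
best_config = {  # best-known knob settings found for each past workload
    "workload_1": {"buffer_pool_mb": 1500, "log_file_mb": 256},
    "workload_2": {"buffer_pool_mb": 600, "log_file_mb": 1024},
    "workload_3": {"buffer_pool_mb": 900, "log_file_mb": 512},
}

def map_workload(target):
    """Return the past workload whose metric vector is closest (Euclidean)."""
    return min(past_metrics, key=lambda name: math.dist(past_metrics[name], target))

unseen = [0.75, 0.15, 0.35]          # metrics observed for the new deployment
match = map_workload(unseen)
print(match, best_config[match])     # → workload_1 {'buffer_pool_mb': 1500, 'log_file_mb': 256}
```

Here the matched workload simply donates its best-known configuration as a starting point for further tuning; in practice the comparison would run over many normalized DBMS runtime metrics rather than three hand-picked numbers.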