Evaluating Database Self-Tuning Strategies in a Common Extensible Framework

Evaluating Database Self-Tuning Strategies in a Common Extensible Framework Rafael Pereira de Oliveira1, Sergio Lifschitz1, Marcos Kalinowski1, Marx Viana1, Carlos Lucena1, Marcos Antonio Vaz Salles2 1Pontifical Catholic University of Rio de Janeiro (PUC-Rio) 2University of Copenhagen Abstract. Database automatic tuning tools are an essential class of database applications for database administrators (DBAs) and researchers. These self- management systems involve recurring and ubiquitous tasks, such as data extraction for workload acquisition and more specific features that depend on the tuning strategy, such as the specification of tuning action types and heuristics. Given the variety of approaches and implementations, it would be desirable to evaluate existing database self-tuning strategies, particularly recent and new heuristics, in a standard testbed. In this paper, we propose a reuse-oriented framework approach towards assessing and comparing automatic relational database tuning strategies. We employ our framework to instantiate three customized automated database tuning tools extended from our framework kernel, employing strategies using combinations of different tuning actions (indexes, partial indexes, and materialized views) for various RDBMS. Finally, we evaluate the effectiveness of these tools using a known database benchmark. Our results show that the framework enabled instantiating useful self-tuning tools with a low effort by just extending well-defined framework hot-spots. Hence, sharing it with the database research community may facilitate evolving state of the art on self-tunning strategies by enabling to evaluate different strategies on different RDBMS, serving as a common and extensible testbed. 1. Introduction We may find many different self-tuning systems available to run, externally coupled, or integrated within a DBMS. There is a need for evaluating their effectiveness and efficiency due to the variety of automatic tuning tools, mainly for relational DBMSs. Either we might want to check for existing features that are mandatory for a given application, or we would like to characterize the pros and cons of implemented heuristics when compared with new ones proposed in the literature. These provide the best-expected results for the database system performance. There exist specific tuning tasks that might have automatic support in a given RDBMS, but not others. For example, it is possible to tune database parameters to server hardware with PostgreSQL [PGTune 2019] automatically. Still, it is hard to find tools for automating a variety of physical database design tasks (e.g., creating materialized views or partial indices). It is infeasible for the DBA to, promptly, test and evaluate all possible physical design tuning actions for a given workload, as even index selection is NP-hard [Piatetsky-Shapiro 1983]. Nevertheless, tools to aid in this task exist for other RDBMS [Bruno 2012]. Software reuse is one of the significant goals of software engineering research [Frakes and Kang 2005]. It aims to reduce development and maintenance efforts and, at the same time, improve software quality due to the use of already designed software units, implemented, validated, and tested [Peters and Pedrycz 2000] [Sommerville 2011]. We observe that database automatic tuning tools share a core of common tasks, even though they might suggest different tuning action types. All self-tuning systems rely on basic tasks such as workload capture, data extraction, SQL parsing, or DBMS catalog access. Hence, this research area could take advantage of software reuse. However, reuse opportunities to support the comparison and evaluation of database self-tuning tools have not yet been systematically explored. Aiming at taking a step towards bridging this gap, this paper describes how we may instantiate a reuse-oriented framework to support the evaluation of self-tuning tools for relational database systems. This framework reduces the effort of building a specialized database application by identifying a typical common architecture and enabling cus- tomization through well-defined framework hot-spots that may implement specific tuning strategies. In pursuit of this aim, this paper makes the following contributions: 1. A suggested list of reuse-oriented requirements for database self-tuning tools where we discuss possible modeling options and justify our choices; 2. A component-based architecture of a framework to create self-tuning applications for relational databases with its corresponding open-source implementation; 3. Three customized automatic database tuning tools extended from our framework kernel, employing strategies using combinations of different tuning actions (indexes, partial indexes, and materialized views) for different RDBMS; 4. An evaluation of how effective our self-tuning tools are and how much they can increase the throughput of a well-known benchmark. Based on our results, we conclude that the framework enabled instantiating useful self-tuning tools empolying different strategies, which significantly reduced query execu- tion costs on different RDMS, with low effort. Hence, we believe that it can serve as a valuable common and extensible testbed for further evolving self-tuning strategies. Our framework is open-source and is available at https://github.com/hidden-for- double-blind-review All data, queries, and selected tuning actions, besides results ob- tained, are available at hidden-for-double-blind-review for experimental reproducibility. The remainder of this paper is organized as follows. Section 2 presents concepts about software reuse and database tuning. In Section 3, we propose a list of reuse-oriented requirements for database self-tuning tools. We discuss possible modeling options and justify our choices. It also puts forward our software framework to support building database self-tuning tools. Section 4 depicts the evaluation process and experimental results for database self-tuning tools built using our framework. Section 5 describes the related work and discusses database self-tuning tools. Finally, Section 6 brings our con- cluding remarks. 2. Background Software reuse is the use of existing software or software knowledge to develop new software. Reusable assets can be either reusable software or software knowledge. It aims to reduce development and maintenance costs and improve software quality due to the use of already designed, implemented, validated and tested software units [Sommerville 2011]. Reusable software components are self-contained, clearly identifiable artifacts that describe or perform specific functions and have clear interfaces, appropriate docu- mentation, and a defined reuse status [Sametinger 1997]. Components are ideally known as software pieces that can be reused in many software systems [Alvaro et al. 2007]. A framework is a reusable application that can be specialized to produce custom applications. Frameworks aim at reusing software units divided according to the requirements of the final software [Fayad et al. 1999]. Typical characteristics of software frameworks include a reusable, semi-complete application that can be specialized to produce custom applications or can act as an application generator directly related to a specific domain [Fayad and Schmidt 1997]. The points of the flexibility of a framework are called hot-spots, which are typically abstract classes or methods that must be implemented. If one wants to generate an application based on a framework, it is necessary to implement a specific code for each hot-spot. Some features of the framework are not mutable and constitute the framework kernel, also called frozen-spots. Database tuning is the process of making a database application run more quickly. This usually means higher throughput, and it may also mean lower response time for some applications [Shasha and Bonnet 2002]. The process of database tuning is an activity where the Database Administrator (DBA) monitors the database, plans possible tuning actions, evaluates and executes them on the database system, and, again, monitors the database to observe the impact on the workload. Self-tuning tools automate this process by delegating steps performed by the DBA to autonomic mechanisms able to follow algorithms and heuristics to monitor and execute tuning actions. Performance is the main optimization goal of the tuning activity. The literature considers different optimization goals, such as availability, fault tolerance, and resource consumption. In this paper, we present a framework to support developing self-tuning tools that can target any of these optimization opportunities. Several reasons motivate database self-tuning systems: (i) the increasing complexity of multi-tenant monolithic applications and services; (ii) the growing complexity of RDBMS’s administration and tuning, which provides hundreds of tuning knobs; (iii) the high cost of ownership for RDBMS-based solution, where hiring experts in system tuning and management is dominant [Chaudhuri and Weikum 2005]. There are many tuning action types defined in the literature. In this work, we focus on the manipulation of three access structures that may achieve better performance for database systems: indexes, partial indexes, and materialized views. We will explore the action of creating these structures with a closer look into multiple DBMSs, exploring our reuse-oriented framework solution. 3. Requirements and Framework Architecture Overview There is a comprehensive amount of literature surrounding

Load more