Duplicate-Sensitivity Guided Transformation Synthesis for DBMS Correctness Bug Detection

Duplicate-sensitivity Guided Transformation Synthesis for DBMS Correctness Bug Detection Yushan Zhang Peisen Yao Department of Computer Science and Engineering Department of Computer Science and Engineering HKUST HKUST Hong Kong, China Hong Kong, China [email protected] [email protected] Rongxin Wu Charles Zhang Department of Cyber Space Security Department of Computer Science and Engineering School of School of Informatics HKUST Xiamen University Hong Kong, China Xiamen, China [email protected] [email protected] Abstract—Database Management System (DBMS) plays a core skills [4] and the hand-written test cases only cover a very role in modern software from mobile apps to online banking. It is small part of the possible input query space [5], leading to critical that the DBMS provides correct data to all applications. insufficient testing of the DBMS. In what follows, we refer to When the DBMS returns incorrect data, a correctness bug is triggered. Current production-level DBMSs still suffer from a test case for the purpose of DBMS correctness testing as a insufficient testing due to the limited hand-written test cases. test query and an expected result set. Recently several works proposed to automatically generate many Recent studies [6]–[10] have proposed several approaches test cases with query transformation, a process of generating an to better explore the query space. They all use automated equivalent query pair and testing a DBMS by checking whether random query generation to enlarge the explored query space. the system returns the same result set for both queries. However, all of them still heavily rely on manual work to provide a However, the key challenge is the test oracle problem, i.e., transformation which largely confines their exploration of the how to find the expected result set of a given query. To address valid input query space. the problem, one category of approaches such as ADUSA [6] This paper introduces duplicate-sensitivity guided transforma- and Qex [7] convert the query to Satisfiability Modulo Theory tion synthesis which automatically finds new transformations by (SMT) constraints and use a constraint solver to deduce the first synthesizing many candidates then filtering the nonequivalent ones. Our automated synthesis is achieved by mutating a results. However, they cannot work on special data types (e.g. query while keeping its duplicate sensitivity, which is a necessary varchar) in the DBMS and big tables because of the perfor- condition for query equivalence. After candidate synthesis, we mance issue of the solver. Another category of approaches keep the mutant query which is equivalent to the given one identify a correctness bug by comparing the result set of a pair by using a query equivalent checker. Furthermore, we have of equivalent queries. However, existing approaches can only implemented our idea in a tool Eqsql and used it to test the production-level DBMSs. In two months, we detected in total use limited rules/transformations to derive the query pairs. For 30 newly confirmed and unique bugs in MySQL, TiDB and example, TLP [10] first generates a query Q and then uses a CynosDB. predefined transformation to on Q to obtain a new equivalent arXiv:2107.03660v1 [cs.SE] 8 Jul 2021 Index Terms—DBMS, database system, SQL, testing query Q0. Next, it checks whether the DBMS generates the same result under the input of Q and Q0. However, since some I. INTRODUCTION correctness bugs can only be exposed under a certain query Database Management Systems (DBMSs) are widely used pair and the predefined transformation may fail to generate in modern industries. The users expect the DBMS to retrieve such query pair, these approaches would lose the chances to correct data records and report correct analysis results on the detect such bugs. To be more specific, the transformation rule giving databases. Unexpectedly, researchers could still find of the existing studies [1], [8], [10] presumes that the generated hundreds of queries where the popular and production-level query pair should explicitly include a “where” predicate (e.g. DBMSs return wrong result sets [1], though these systems where a != 0) and the bug was triggered by the inconsistent have been extensively tested during their development [2], [3]. interpretations of “where” predicates. However, as shown in To validate the correctness of different DBMS components Figure 1, this bug is exposed by two equivalent queries QA (e.g. the optimizer), the developers usually manually write test (Fig 6b) and QB (Fig 6c) in which there are no “where” cases and provide the expected result for a query. However, the predicates. Essentially, this bug is caused by an inconsistent quality of test cases is largely limited by developers’ testing representation of the float-point value 0:001. QA and QB are expected to return the same value on the given table t0, but is duplicate sensitive and MAX is duplicate insensitive. they have a different value for 0:001 as shown in the result Additionally, WHERE and HAVING are duplicate sensitive column. and the DISTINCT operator is duplicate insensitive. The key point in identifying such a correctness bug in Due to the limited space, we show how our approach works Figure 1 is to compare if a row appears in the result sets of the with two common query categories: SELECT-FROM-WHERE mysum1 mysum2 equivalent query pair. In this example, and and SELECT-FROM-GROUP BY. Let us start with the first do not have a same row, and then this bug is exposed. Inspired category, and then move to the more complex category with by this example, our key idea is to create such a query that aggregations. Figure 2 shows the categories, the corresponding a row appears the same number of times in the result sets seed query and the generated query mutant. The gray color of both the original and transformed queries. To achieve this highlights where the transformation takes place. goal, it is required to describe the property whether duplicated rows should present in a query’s result set, which we denote as III. CATEGORY 1: SELECT-FROM-WHERE query duplicate sensitivity. This property is generally available This category is one of the most frequent queries in the on all SQL queries, which does not require the query to have DBMS, which asks the system to filter out data records not a specific component such as the “where” predicate. Although satisfying the “where” predicate. These queries will go through we can create query mutants by preserving this property, it complex equivalent transformations during the optimization can generate queries which are not equivalent to the given phase [15], potentially resulting in a nonequivalent final result one because it is a necessary condition. To keep the valid set due to implementation bugs [16], [17]. However, such bugs query mutants only, we leverage the existing query equivalence are hard to be captured due to the large amount of optimization checker [11] to do the job. combinations [16]. Previous work NoREC [8] can only detect Following our insight, we designed an automated cor- such bugs because it only manipulates the “where” predicate rectness bug detection tool with duplicate-sensitivity guided to create an unoptimized query mutant. query transformations. Eqsql could synthesize many candidate For queries in this category, there are many candidate mutant transformations for a query and it uses the query transformations to apply. In this example, our approach inserts equivalent solver EQUITAS [11] to identify the valid mu- additional operators to the seed query. By adding DISTINCT tants. To evaluate the effectiveness of our approach, we have to SELECT a in (Figure 2), it generated QA. And by adding implemented it in a tool called Eqsql. We ran Eqsql on several groups with GROUP BY, it gets QB. Both inserted operators real-world production-level DBMSs, namely MySQL [12], remove duplicates of an a value and keep only one copy of PingCAP TiDB [13] and Tecent CynosDB [14]. During a each different a value. Because adding duplicates for an a two month evaluation, our tool has found 30 confirmed and value or remove some duplicates will not change the result unique bugs. It detected 14 bugs in MySQL, 13 bugs in TiDB for DISTINCT and GROUP BY, they are duplicate insensitive. and 3 bugs in Tencent CynosDB. While we evaluated our The operators de-duplicate the input tuples and output results tool on MySQL-compatible database systems, we believe our with one copy for each row. We denote this operation as δ, approach is also applicable to the other DBMSs. following the convention in [18]. In summary, we made the following contributions: IV. CATEGORY 2: SELECT-FROM-GROUP BY • We present a workflow for synthesizing equivalent- For this category, we consider a query with grouping and an preserving query transformations, based on duplicate sen- aggregate function. Different from the first category, queries sitivity of SQL clauses. of this category always produce an intermediate result [19], • We implemented our approach in Eqsql, which is a fully which is a set of distinct groups formed by the GROUP automated testing tool for detecting correctness bugs in BY columns. A GROUP BY operator splits the input records MySQL-compatible database systems. into groups based on the given grouping condition. After a • We detected 30 new and unique bugs in the popular grouping, normally an aggregate function is used to compute production-level DBMSs, as a practical evaluation of some results on a group, such as a sum of a numeric column. Eqsql. However, we could not extract the groups explicitly with any SQL operators, which makes it harder to discover such bugs II. ILLUSTRATIVE EXAMPLES only with the final result set.

Duplicate-Sensitivity Guided Transformation Synthesis for DBMS Correctness Bug Detection

Towards a Library for Deterministic Failure Testing of Distributed Systems

Go in Tidb Yao Wei | Pingcap About Me

Architecting Cloud-Native NET Apps for Azure (2020).Pdf

Using Rust to Build a Distributed Transactional Key-Value Database Liutang | [email protected] About Me

Towards a Library for Deterministic Failure Testing of Distributed Systems

Cloud Container Engine

Sunhao | Pingcap Agenda

Automatically Detecting and Fixing Concurrency Bugs in Go Software Systems

Replicating Mysql Data to Tidb for Near Real-Time Analytics

Running Databases in a Kubernetes Cluster

Contributions to Large-Scale Distributed Systems the Infrastructure Viewpoint - Toward Fog and Edge Computing As the Next Utility Computing Paradigm? Adrien Lebre

LSM-Tree Database Storage Engine Serving Facebook's Social Graph