The Art of Balance: A RateupDB™ Experience of Building a CPU/GPU Hybrid Database Product

Rubao Lee, Minghong Zhou, Chi Li, Shenggang Hu, Jianping Teng, Dongyang Li
Rateup Inc., Chengdu, Sichuan, China

Xiaodong Zhang
The Ohio State University, Columbus, OH, USA

ABSTRACT
GPU-accelerated database systems have been studied for more than 10 years, ranging from prototyping development to industry products serving in multiple domains of data applications. Existing GPU database research solutions are often focused on specific aspects in parallel algorithms and system implementations for specific features, while industry product development generally concentrates on delivering a whole system by considering its holistic performance and cost. Aiming to fill this gap between academic research and industry development, we present a comprehensive industry product study on a complete CPU/GPU HTAP system, called RateupDB. We firmly believe "the art of balance" addresses major issues in the development of RateupDB. Specifically, we consider balancing multiple factors in the software development cycle, such as the trade-off between OLAP and OLTP, the trade-off between system performance and development productivity, and balanced choices of algorithms in the product. We also present RateupDB's complete TPC-H test performance to demonstrate its significant advantages over other existing GPU DBMS products.

1 INTRODUCTION
In the past decade, GPU-accelerated database systems have grown up rapidly from an early stage of concept proofing (e.g., [22]), inaugurating and prototyping (e.g., GDB [48], GPUDB [115], Ocelot [53]) and many individual projects in the academic community (e.g., [18, 28, 44, 61, 78, 90, 95, 98, 99, 101, 105, 113]), to the stage of making products in the database industry for large-scale deployments for various critical applications. Several commercial GPU DBMSs have been built, either by GPU-customized designs, e.g., Kinetica [6], OmniSci [9] (formerly known as MapD [84]), SQream [10], or by extending existing database systems, e.g., the PostgreSQL-based Brytlyt [2] and HeteroDB [4]. The design space for a GPU database is large, where many optimizations can be done to offer multiple possibilities to improve performance. However, the system improvement in many research projects is not necessarily achieved for the overall performance, but for isolated cases. Based on our development experience of RateupDB, a high-performance GPU database product, we demonstrate that the essential issue is to achieve an overall system balance by trading off multiple important performance factors. Only in this way are we able to achieve high performance, scalability, and sustainability for a GPU database.

PVLDB Reference Format:
Rubao Lee, Minghong Zhou, Chi Li, Shenggang Hu, Jianping Teng, Dongyang Li, and Xiaodong Zhang. The Art of Balance: A RateupDB™ Experience of Building a CPU/GPU Hybrid Database Product. PVLDB, 14(12): 2999-3013, 2021. doi:10.14778/3476311.3476378

This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing [email protected]. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment. Proceedings of the VLDB Endowment, Vol. 14, No. 12, ISSN 2150-8097. doi:10.14778/3476311.3476378

1.1 The Rise of GPU DBMSs
The rise of GPU database systems is mainly driven by real-world application requirements. As human society has entered the data computing era, which is driven by increasingly more and diverse applications supported by new technologies and innovative concepts in the society, such as mobile computing, digital content sharing, shared economy and others, new application requirements are often beyond the scope of functionalities provided and optimized by conventional database systems. Based on our collaborations with data industry practitioners and customers, we summarize the new requirements into the following three categories.

Requirement for Real-Time Business Insights: This is the most important application need for users who need to obtain instant business insights by applying various analytics on timely updated data (e.g., [33, 39, 106, 119]). A typical scenario is the real-time vehicle position management as represented by the cases of AresDB used in Uber [5] and Kinetica used in USPS [12]. We have learned from a major global service company owning several millions of operating vehicles that dynamic route planning and deviation rate analytics based on real-time vehicle positions are not only necessary to lower operational costs and improve user experiences, but also vital for drivers and passengers to benefit from on-time malicious behavior detection for criminal precaution.

Requirement for Extreme Computing Power: The explosive trend to process an increasingly huge amount of non-structural data seriously challenges the limited computing power in CPU-only database systems, particularly in this post-Moore's Law era. We have witnessed a number of daily production cases that demand ultra-high performance by hardware-accelerated solutions to process unconventional and huge data sets in relational form, which are critically important for applications including but not limited to DNA sequencing [14, 80], spatial data analytics [16, 35, 110], and 3D structure analysis [79, 96]. We have surprisingly observed that users still prefer to use a database product to manage their data, although they are often unsatisfied with the database system's performance for their tasks. This is because databases have an advantage of providing simple interfaces for users. If they use multiple fractional software tools, they may have to manually manage them.

Requirement for One-Stop Data Management: With the demand of real-time data analytics (from simple aggregations to complex statistical analytics), users often demand to use a one-stop solution to unify data writes, query processing, and machine learning. Compared to the commonly used two-stage solution (e.g., RocksDB+Spark [29, 118]), a single database (e.g., [5][12]) can significantly save users' cost in usage and management, and can avoid data transfers between separate systems. Other examples are Teradata Vantage [71] and the Google BigQuery cloud service [1] that embed machine learning functionalities into their relational query execution engines; thus the need of using a separate machine learning system for the in-database data is removed. However, the requirements of performance isolation and service-level agreement bring new challenges for the implementation of one-stop databases.

Rapid Hardware Advancement: The rapid development of hardware technology allows GPUs to provide increasingly parallel computing power and large memory space. With the NVLink protocol [8], a single server can provide a fast-accessing memory pool of several hundreds of GBs to feed GPU cores for data processing. Therefore, such a hardware advancement trend is creating a GPU-Style In-Memory Computing environment, which has appeared in recent database research work (e.g., [116][99][90]). More than ten years ago (in 2010), when RAMCloud was proposed to realize the concept of all-in-memory data storage [89], the two key factors used to make the case were 64GB DRAM Capacity/Server and $60 Cost/GB DRAM. Today, such a concept has become the reality for data processing, e.g., in main-memory database systems (e.g., [34, 64, 120]). Similarly, today's GPU hardware capabilities have already exceeded or stayed at the same level as the configurations in RAMCloud in both capacity and price/capacity. Even a single Nvidia A100 can have 80GB memory, which is bigger than the DRAM capacity per server in RAMCloud. Regarding the price, public GPU cards on Amazon.com (PNY NVIDIA Quadro RTX 8000 with 48GB GDDR6 memory for $4,997 and EVGA GeForce RTX 3090 with 24GB GDDR6 memory for $2,499) have already provided a memory price at around $100 Cost/GB – even if we only count the final price to the memory part. The commodity GPU development gives us a promising landscape, and we can optimistically state that GPU's computing power will be best utilized with fast data accessing in large memory space.

1.2 Targeted Issues in Building a GPU DBMS
Although many academic research efforts have been made for optimizing GPU database processing (e.g., [18, 28, 44, 61, 90, 95, 98, 99, 101, 105]), they often focused on a single aspect, without giving a complete reference for building a real database product. Here are the three issues we attempt to address in this paper.

The Issue of Algorithm Choices: Various algorithms (e.g., hash join vs sort-merge join) are available, but how to make a right selection among them is a key issue for an industry product.

The Issue of Simple Data Analytics: In existing GPU database literature, we often find that performance evaluations are conducted by SSB-like (Star Schema Benchmark) workloads. However, an industry product demands intensive performance evaluations by complex data analytics workloads, such as TPC-H queries that require much more practical SQL features [11].

The Issue of Isolated Environments: In existing GPU database literature, we often find that the query performance is evaluated in an isolated environment, assuming there are no companion data modifications. However, daily database operations in practice unavoidably involve both reads and writes.

1.3 Contributions
This paper aims to address the above mentioned issues by presenting the basic structure, design choices, and performance insights into RateupDB. Attempting to fill the gap between academic research and product development, we make the following contributions.

Making Right Choices by Taking Balanced Considerations: We introduce a set of macro-level design choices for implementing RateupDB, and why we make these choices. We present implementation techniques for several important parts, especially on how to combine query executions and transaction processing.

Performance Insights into RateupDB by Complex Data Analytics in a Production Environment: We present RateupDB's performance for a complete TPC-H test: (1) by using the official benchmark specification and (2) by strictly satisfying the test requirements. Our comprehensive evaluation results confirm the value of the system balancing principle in RateupDB.

2 THE ARCHITECTURE
We first discuss several identified macro-level balancing points, which lay the foundation for RateupDB's overall design strategy.

2.1 Key Balancing Points in System Building
RateupDB is a Heterogeneous Hybrid Transactional and Analytical Processing (simply called Heterogeneous HTAP, or H2TAP [18][95], and we simplify the term as HTAP in this paper) database system, which is designed to best utilize CPU, GPU, and large DRAM memory for co-running hybrid workloads.

To achieve the goal for an industry product, the development of RateupDB is focused on the following three key balancing points, which are related to both technical challenges and engineering costs.

Balancing Point 1: Transaction vs Query. A major goal for a HTAP system is to optimize both transaction processing and query execution in a shared database platform while providing performance isolation for satisfactory user experiences in quality of service (QoS). This point is largely related to major system strategies including hardware resource partitioning, data store formats and access methods, and concurrency control methods. Making balanced performance trade-offs in a large design space is challenging due to various conflicting factors (e.g., the RUM conjecture [19][93] or multi-source integration analytics [94]).

Balancing Point 2: Performance vs Engineering Cost. Unlike an academic research project, the development of an industry product must consider engineering cost, which is a constraint from both the budget of financial resources and the requirement of production deadlines. Therefore system design choices and various algorithm adoptions must balance multiple factors including performance, adaptiveness, robustness, and implementation cost (e.g., the degree of independence from other system components). A typical example for this balancing point is the choice of GPU join algorithms, which will be discussed later in this paper.

Balancing Point 3: Performance vs Portability. To achieve the goal of both high performance and portability in software development is challenging. Specifically, in high performance computing, e.g., [92][46], we often aim at achieving two goals that are conflicting: (1) to best utilize all the features in both hardware platforms and software libraries, and (2) to maximize software adaptiveness if certain hardware features are not universally available. This balancing point is important for the development of RateupDB, which often requires GPU hardware-conscious algorithms (e.g., the memory management problem to be discussed later in this paper).

2.2 An Architectural Overview
Figure 1 shows an architectural overview of RateupDB. We summarize RateupDB's main features as follows.

Figure 1: An Overview of RateupDB Architecture.

Hardware-guaranteed Performance Isolation: A major challenge of building HTAP systems is to minimize performance interference between co-running transaction workloads and analytical workloads because the two types of workloads have distinct execution patterns, SLA (Service-Level Agreement) requirements, and optimization policies. Achieving the dual goal of performance isolation and frequent data updates is challenging in CPU-only HTAP systems [95]. However, a GPU-based HTAP system has its unique advantages to address this issue due to its capability of assigning and running different workloads in separate hardware devices. As shown in Figure 1, RateupDB uses the CPU to execute OLTP workloads (i.e., insert/delete/update) and uses the GPU to execute various queries of OLAP workloads in the shared data stores. Because the CPU and the GPU are separate hardware devices, performance isolation can be guaranteed if corresponding software is well defined.

MVCC-based Dual Data Stores: The data store, as a persistent repository, plays an important and foundational role for both query execution and transaction processing, particularly when the two workloads are co-running together. Since RateupDB is built on both CPU and GPU, the design and implementation of the data store must be in a way so that (1) query execution on GPU is efficient and (2) performance interference is minimized. To achieve the two goals, RateupDB uses "dual stores" in main memory, which consist of two parts: AlphaStore and DeltaStore (see Figure 1). For a given time of the database, AlphaStore stores the existing data in the database for query processing, while DeltaStore stores the changes made to the database by new transactions. A Multiversion Concurrency Control (MVCC) is utilized to manage the two stores cooperatively. In this way, the GPU device can process OLAP queries independently by obtaining a snapshot of the database from combining AlphaStore and DeltaStore, while the CPU can execute transactions to modify the database states in DeltaStore. In addition, RateupDB uses a part of GPU device memory (called AlphaCache) to cache frequently used data items in AlphaStore in order to avoid unnecessary data transfers from host memory to GPU device memory. Currently, both AlphaStore and DeltaStore are in column store formats by which each column of a table is stored independently.

3 DATA FORMAT
3.1 Data Store Format Matters!
Although the basic data format for a relational table is either row store or column store, there are multiple choices in the design space, e.g., (1) choosing one of the basic stores, (2) mixing them to form a hybrid format (e.g., [15, 51, 58]), and (3) combining the basic stores using multiple data stores. Table 1 lists several typical database systems for each category of data stores, which shows diverse design choices in different database systems.

Table 1: Examples of data stores in several DBMSs.

Format/DB                  | Postgres | MonetDB | HyPer  | SAP HANA      | Vertica | Caldera   | BatchDB | RateupDB
Unified Store: Row         | Yes      |         | Choice |               |         | Choice    |         |
Unified Store: Column      |          | Yes     | Choice |               |         | Choice    |         |
Unified Store: Hybrid      |          |         |        |               |         | Preferred |         |
Dual Store: Row+Row        |          |         |        |               |         |           | Yes     |
Dual Store: Row+Column     |          |         |        | Yes (L1-L2)   | Yes     |           | Future  | Tried
Dual Store: Column+Column  |          |         |        | Yes (L2-Main) | Tried   |           |         | Yes

3.2 The Necessity of Dual Stores
As a HTAP database system, RateupDB takes a dual store approach (column+column) for several reasons. A conventional wisdom is that row store is more suitable for transaction workloads (e.g., PostgreSQL) while column store is more suitable for analytical workloads (e.g., MonetDB [27]). But for HTAP systems that have to handle both transaction and analytical workloads, neither row store nor column store is sufficient. Thus several HTAP systems (HyPer [62], SAP HANA [100], Caldera [18], BatchDB [81]) use different ways of combining the two stores. Basically, the four systems fall into two categories: (1) Single-Store: allowing users to choose either row store, column store, or a hybrid store; (2) Multi-Store: combining two or even three stores to best serve transactions and queries. CPU-based HyPer and GPU-based Caldera are typical examples in the first category, while SAP HANA and BatchDB are in the second one. For the first category, compared to HyPer, Caldera's performance results suggested that a hybrid data store is a more effective choice than pure row store or pure column store. For the second category, SAP HANA uses three-level stores (L1 Delta, L2 Delta, and main store), where the first level is in row store for absorbing transactions and the last two levels are in column store for query processing. BatchDB (as well as TiDB [59]) uses separate stores (by replicas) to isolate OLTP and OLAP. Oracle Database In-Memory (ODIM) [70, 91] is also in this category, which takes dual-format in-memory stores (column store for OLAP and row store for OLTP – caching disk data).

The unified single-store approach and the separate dual-store approach in Table 1 have both pros and cons. First of all, maintaining separate stores needs extra cost to manage and merge multiple data stores. However, separating stores for transactions and queries can minimize the interference on the shared data, which is particularly suitable for CPU-GPU heterogeneous database systems, where queries are executed on GPU and transactions are executed on CPU. Ideally, when the GPU begins to execute read-only queries, it should minimize the interference to the CPU by avoiding making the CPU execute too much format parsing or data preparation. Furthermore, when the CPU executes transactions, its write operations and synchronization-related operations (locks or latches) should not interfere with GPU processing. A dual store, if managed well, is an effective solution to achieve these goals.

3.3 Why Column+Column?
Now the question is what data formats should be used in a dual store. First, for analytical tasks, column store is the best choice for its various performance advantages [103][13]. Therefore, almost all the above mentioned HTAP systems use column store for their analytical part. The only exception is BatchDB [81], which uses a combination of row+row store because the purpose is to demonstrate the necessity of using separate stores for OLTP and OLAP. Nonetheless, the team of BatchDB admits the advantage of column store for analytics, and plans to implement it in future versions. Second, for the transaction tasks, we have observed that not all systems consistently use row store despite its claimed advantages for OLTP workloads. SAP HANA applies a hybrid approach to bridging the transaction tasks and the analytical tasks by applying a row-store L1-delta and a column-store L2-delta in front of its column-oriented main memory store [100]. Two other system cases (HyPer and Vertica) demonstrate the possibility of using column stores for absorbing transaction writes to the database. HyPer [62]: "HyPer can be configured as a row store or as a column store. For OLTP we did not experience a significant performance difference." Vertica [72]: "The WOS has changed over time from row orientation to column orientation and back again. We did not find any significant performance differences between these approaches and the changes were driven primarily by software engineering considerations."

The fundamental reason of using column store for transactions is that it can greatly accelerate query processing and table merging operations, while its performance reduction for OLTP is negligible or acceptable. Like Vertica, RateupDB began with a row+column dual store, but it was quickly changed to a column+column dual store in order to accelerate analytics workloads while keeping transactions at a level of acceptable performance.

3.4 RateupDB's Data Store
AlphaStore: AlphaStore (meaning the main store) is a column-oriented data store loaded into main memory from disks, by which each table column is stored in continuous data chunks. When stored on disks, AlphaStore does not contain any MVCC information. After loading, all tuples are read-only and visible to all later transactions, unless they are deleted, which would be recorded by DeltaStore. Essentially, when a query is executed, the data in AlphaStore will be transferred into GPU for query processing. RateupDB uses a zone in the GPU device memory, called AlphaCache (see Figure 1), to store frequently used data in AlphaStore by an LRU algorithm.

DeltaStore: DeltaStore is an MVCC-based column store, which records the changes to AlphaStore after each transaction is finished. For an insert command, the newly inserted data will be appended into corresponding columns of DeltaStore. For a delete command, the IDs of rows being deleted will be recorded by DeltaStore into a delete vector. For an update command, it is essentially converted into an insert and a delete operation. Detailed concurrency control of the stores will be presented in Section 5.
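To make the dual-store layout above concrete, the following C++ sketch shows one possible shape of the per-table structures. It is purely illustrative: the names, field layouts, and the decomposition of an update into delete+insert follow the description in this section, but they are assumptions rather than RateupDB's actual (non-public) code.

  // Illustrative sketch only: hypothetical, simplified structures for the
  // dual-store layout (AlphaStore + DeltaStore); real structures will differ.
  #include <cstdint>
  #include <vector>

  struct ColumnChunk {                       // contiguous values of one column
      std::vector<uint8_t> data;
  };

  struct AlphaTable {                        // read-only after loading; no xmin info
      std::vector<ColumnChunk> columns;      // one entry per table column
      std::vector<uint64_t>    xmax;         // header column: deleting txn ID, 0 if live
  };

  struct DeltaTable {                        // main-memory store of committed changes
      std::vector<ColumnChunk> columns;      // newly inserted values, appended per column
      std::vector<uint64_t>    xmin, xmax;   // creating / deleting transaction IDs
      std::vector<uint64_t>    deleteVector; // row IDs of AlphaStore tuples deleted
  };

  // An UPDATE is decomposed into a delete and an insert, as described above.
  void updateRow(AlphaTable& a, DeltaTable& d, uint64_t alphaRowId, uint64_t txnId) {
      a.xmax[alphaRowId] = txnId;            // mark the old version as deleted by txnId
      d.deleteVector.push_back(alphaRowId);  // record the deletion for later snapshots
      d.xmin.push_back(txnId);               // new version's column values appended
      d.xmax.push_back(0);                   // (value copies omitted in this sketch)
  }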

4 THE GPU QUERY PROCESSING
RateupDB uses a GPU-based query execution engine. In this section, we focus on the following three aspects. (1) Engine Structure: It essentially determines how a series of relational operations in a query planning tree are connected and executed on GPU devices; (2) Algorithm Choices: It concerns how to best utilize GPU hardware to implement each concrete algorithm for various operators; (3) Complex Query Handling: It addresses issues on how to execute complex SQL queries with various subquery forms.

4.1 On Query Engine Structure
RateupDB's query engine structure applies an approach of operator-at-a-time, which is inherited from its early prototype GPUDB [115]. Basically, all the operators in a query plan tree are finished in a sequential order according to the tree structure. For each operator, after all its input data are transferred into GPU memory, a set of GPU kernels is executed, while the final output results of the operator are kept in the GPU device memory for consequent operator executions. Although such a model has its advantage of universally executing all SQL queries, it may not effectively execute multiple operators that have correlations and thus can be finished in a combined way. To address the problem, RateupDB takes an approach of combinational operators to optimize certain important OLAP query scenarios. RateupDB uses the Star Join and Self operators to achieve the goal of best utilizing GPU computing resources while minimizing data transfer overheads, which will be discussed later.

4.2 On Join and Grouping
A significant effort for RateupDB is focused on how to effectively execute various relational operations on GPU, particularly for join and grouping/aggregation. There has been a long history of debate on (1) how to implement the two operations and (2) what to choose: sort-based algorithms vs hash-based algorithms. A thread of existing research work addressing the above two questions on multi-core or many-core CPUs often gives different conclusions. However, a widely accepted conclusion is that a complex cost model considering both data distributions and hardware parameters is needed to compare these two categories of join algorithms. We face a challenge on how to build a query engine by taking prior research results into consideration in a large algorithm design space.

4.2.1 Sort vs. Hash Revisited. We first present a brief survey of the existing results for the issue of sort vs hash in both CPU and GPU databases, which lays a foundation for our approach in algorithm design and implementation. Kim et al. [63] proposed two SIMD-optimized parallel join algorithms: a radix hash join algorithm and a sort-merge join algorithm that uses bitonic merge networks and multiway merging to implement the underlying sort operation [30]. Their performance analysis results project that the sort-merge join could outperform hash join for future many-core hardware with wider SIMD and smaller per-core memory bandwidth. Blanas et al. [25] showed that non-partitioning hash join (using a shared hash table) can outperform more complex partitioning-based hash join algorithms including radix hash join. Albutiu et al. [17] proposed NUMA-optimized, massively parallel sort-merge join algorithms that use range-partitioning-based merge-join techniques with a combination of multiple sorting algorithms including radix sort and quick sort. Their performance comparisons show that sort-merge join is faster than both radix hash join and non-partitioning hash join. Balkesen et al. [21] proposed both an optimized radix hash join (by reducing cache misses and TLB misses) and a SIMD-optimized sort-merge join whose sorting utilizes sorting networks and bitonic merge networks. In addition to demonstrating that their join implementations are faster than the ones in both [25] and [17], they also showed that their hash join is faster than their sort-merge join. However, for large workloads, the performance is comparable.

Shifting from multicore to GPU brings a new debate on sort vs. hash. He et al. [49] compared radix hash join and sort-merge join using bitonic sort and quick sort on early GPU hardware, and concluded that radix hash join is faster than sort-merge join. Yuan et al. [115] proposed a non-partitioning hash join on GPU for star-schema queries. A recent GPU-related study by Shanbhag et al. [99] also used such a non-partitioning strategy and indicated that an advantage of non-partitioning hash join compared to radix hash join is its effective usage in pipelined multiple joins. Despite radix hash's single-join performance advantage over non-partitioning hash join [21], its performance advantage compared to GPU-optimized sort-merge join is weakened, considering the GPU Merge Path algorithm [45][87] for fast merging, and other radix sorting algorithms [102]. Rui and Tu [98] implemented an improved radix hash join by avoiding twice hash probing, and a sort-merge join that uses GPU Merge Path for both the sorting stage and the merge-join stage. Their performance analysis showed that sort-merge join outperforms radix hash join on GPU. It is worth mentioning that their general-purpose merge sorting is not the fastest radix-based sorting [102], which has limitations for arbitrary data types. Sioulas et al. [101] implemented a hardware-conscious radix hash join and showed its speedup over the non-partitioning hash join.

4.2.2 The RateupDB Approach. Ideally, all above mentioned join algorithms should be implemented in a DBMS, and the system further utilizes a query optimizer with precise cost models to select the right algorithms for a given query. However, it is a non-trivial task to implement such a query optimizer (e.g., [74]) that involves a set of execution components including data distribution estimation and runtime statistical information collecting. Nonetheless, its value in practice may often be insignificant [77]. To balance the database performance and engineering efforts, our RateupDB approach to implementing join algorithms is based on the following three principles. (1) Implementing a default, general-purpose join algorithm. We use sort-merge join to serve this purpose. Its sort stage uses both radix sorting and merge path sorting, depending on the join key characteristics, and its merge-join stage uses the merge path algorithm. (2) Optimizing star-join-dominated data warehouse queries. We use non-partitioning hash join for this purpose. Furthermore, it provides a star-hash join for multiple joins based on the binary non-partitioning hash join. (3) Using simple rules to select join algorithms. We do not use any cost models, but use simple rules based on schema information only. Our join algorithm selection process is illustrated by Figure 2.

Figure 2: A Selection Process of Join Algorithms.

The reasons why we take the sort-merge approach for the default join algorithm instead of radix hash join are both performance related and development-cost related. Performance Considerations: As pointed out in [98], if both are optimized by utilizing GPUs, sort-merge join is faster than radix hash join. This conclusion is consistent with the results provided by [21], which shows sort-merge join is more favorable for large input tables. Engineering and Development-Cost Considerations: Sort-merge join has certain advantages over radix hash join for both robustness to handling data size variations and its symmetric nature of removing the necessity of a query optimizer to designate which table is for hash building or probing. Furthermore, the sort stage, which is a general parallel computing problem, can be separated from the merge join stage so that it can be independently optimized by CUDA engineers without any database background knowledge.
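The rule-based selection described above can be summarized in a small amount of host code. The sketch below is an assumption-laden illustration (the predicate names and the exact rule set are ours, not RateupDB's; Figure 2 defines the actual decision process), showing how a selection without cost models can be driven purely by schema information.

  // A minimal sketch of rule-based join-algorithm selection (no cost model).
  enum class JoinAlgo { NonPartitioningHash, StarHash, SortMergeRadix, SortMergeMergePath };

  struct JoinInput {
      bool probeSideIsFactTable;     // schema: fact table joins dimension table(s)
      bool keyIsPrimaryForeignKey;   // primary key - foreign key relationship
      int  numDimensionTables;       // > 1 enables a pipelined star join
      bool keyIsFixedWidthInteger;   // radix-sortable with CUB primitives
  };

  JoinAlgo selectJoinAlgo(const JoinInput& in) {
      if (in.keyIsPrimaryForeignKey && in.probeSideIsFactTable) {
          // Non-partitioning hash join; with several dimension tables the fact
          // tuples probe all dimension hash tables in one pipelined pass.
          return in.numDimensionTables > 1 ? JoinAlgo::StarHash
                                           : JoinAlgo::NonPartitioningHash;
      }
      // Default general-purpose join: sort-merge, choosing the sort implementation
      // by the join key type (radix sort for primitive keys, merge-path otherwise).
      return in.keyIsFixedWidthInteger ? JoinAlgo::SortMergeRadix
                                       : JoinAlgo::SortMergeMergePath;
  }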

For primary-foreign key joins, we use non-partitioning hash join [115][99]. Compared to radix hash join, the basic advantage of non-partitioning join is that it allows a pipelined star join execution among one fact table and multiple dimension tables. In our implementation, each tuple from the fact table probes the hash tables of all the dimension tables. Our performance measurements show that such a technique is critically important for improving the performance of executing typical warehousing queries, such as TPC-H queries. Even for the binary join, where the fact table is often much larger than the dimension table, the overhead of radix hash join's partitioning step for the fact table is unnecessarily high considering its unavoidable random memory accesses [21][98].

For sort-merge join, we take a hybrid approach for the sort stage. Its default choice is to use radix sort, if the data values are suitable for being sorted lexicographically, such as integer sorting. Essentially, we use the CUDA CUB header library [3] as the working engine for radix sort. Because CUB can only execute sorting on C++ numeric primitive types, it cannot be used to sort complex data types, such as Decimal, which often has a struct type wrapping internal data representations. For such data types, we use a general merge sort algorithm based on GPU Merge Path [98].

4.2.3 On Grouping. Like join, processing grouping in a GPU database can also be hash-based (e.g., GPUDB [115]) or sort-based (e.g., CoGaDB [28]). By default, RateupDB uses a sort-based grouping strategy. First, sort-based grouping outperforms hash-based grouping when the number of groups is large. Prior research work [61][105] reported that for grouping, when there are fewer than 200,000 groups, the hashing algorithm is faster. However, once the hash table is too large for the L2 cache to store, its performance is significantly lower than that of the sorting algorithm. Such a phenomenon is actually similar to the multicore CPU join scenario [21]. Second, sort grouping can be implemented easily and better optimized than hash grouping, considering that (1) sorting can be implemented separately, and (2) it is a non-trivial task to implement an efficient hash table on general data sets to minimize or remove collisions completely [53][105].

4.3 On Subquery Processing
Subquery processing is important for SQL performance in database products, such as in SQL Server [37] and in Oracle Database [23]. RateupDB does not use general-purpose subquery processing algorithms (e.g., [86][41]). However, we implement commonly used subquery unnesting techniques that can transform subqueries into various joins and aggregations (e.g., the Kim method [65]), which allows RateupDB to efficiently execute TPC-H-style queries.

4.3.1 An Example. Although giving a comprehensive overview of subquery processing is out of the scope of this paper, we introduce how we optimize an important subquery type: EXISTS and NOT EXISTS. The following code is a part of TPC-H Q21, which essentially uses two subqueries to filter out some records from lineitem. We take a typical subquery unnesting technique to execute the query by: (1) using semi-join to execute the EXISTS subquery and (2) using anti-join to execute the NOT EXISTS subquery.

  SELECT ... FROM lineitem l1, ...
  WHERE l1.l_receiptdate > l1.l_commitdate
    AND EXISTS (
      SELECT * FROM lineitem l2
      WHERE l2.l_orderkey = l1.l_orderkey
        AND l2.l_suppkey <> l1.l_suppkey)
    AND NOT EXISTS (
      SELECT * FROM lineitem l3
      WHERE l3.l_orderkey = l1.l_orderkey
        AND l3.l_suppkey <> l1.l_suppkey
        AND l3.l_receiptdate > l3.l_commitdate)

4.3.2 Semi/Anti-join and Self. Implementing semi- and anti-join is similar to equi-join, using either hash-based or sort-based algorithms. Because each of the left table's tuples is output either once or not at all, the output buffer can be pre-allocated with the length of the left table. Parallel threads can use AtomicAdd to write the results to correct positions, without the need of twice probing in hash equi-join or a prefix scan in sort equi-join.

However, considering the query semantics in Q21, naively executing the filter and the two joins is slow because all the operations are only conducted on the same table. To exploit such a correlation [75][57], we implement a Self operator, which combines multiple operations into a common stage. This is similar to the Super-Operator concept in Microsoft SCOPE [76] and HorseQC [43]. If multiple operations on the same table have the same partitioning opportunity, i.e., the whole task can be partitioned into independent sub-tasks, then the Self operator will be enabled. In the same example, Self will partition lineitem by the shared join key l_orderkey and execute the three operations for each partition.

The Self operator can be hash-based or sort-based, depending on whether the partitioning column has only unique values. For the same example, Self is sort-based. Like sort-based grouping, the table is grouped first. Inside a group, Self will execute the required operations (by calling corresponding functions in other operator modules), similar to the GROUP concept used by Pig Latin [88] to allow users to execute arbitrary code instead of only aggregations.
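The AtomicAdd-based output scheme for semi/anti-join can be sketched as a single CUDA kernel. This is a hedged illustration, not RateupDB's code: the hash table layout (open addressing with linear probing and a -1 empty sentinel) and parameter names are our assumptions; only the pre-allocated output buffer and the atomicAdd slot claiming reflect the scheme described above.

  // Sketch: semi-join output via atomicAdd into a buffer of length |left table|.
  __global__ void semiJoinKernel(const int* leftKeys, int leftCount,
                                 const int* hashKeys, int hashSize,  // right-side hash table
                                 int* outRowIds, int* outCount) {
      const int EMPTY = -1;                          // empty slot marker (keys assumed >= 0)
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i >= leftCount) return;
      int key = leftKeys[i];
      bool match = false;
      // linear probing in a pre-built right-side hash table (load factor < 1 assumed)
      for (int s = key % hashSize; hashKeys[s] != EMPTY; s = (s + 1) % hashSize) {
          if (hashKeys[s] == key) { match = true; break; }
      }
      if (match) {                                   // an anti-join would test !match instead
          int pos = atomicAdd(outCount, 1);          // claim a unique output position
          outRowIds[pos] = i;                        // emit the left tuple's row ID once
      }
  }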

5 TRANSACTION PROCESSING
5.1 The Basic MVCC Method
We use a PostgreSQL-style Multiple-Version Concurrency Control (MVCC) to implement snapshot isolation in RateupDB. As illustrated in Figure 3, RateupDB uses a dual store mode, namely, AlphaStore and DeltaStore. Both stores are column stores. AlphaStore is read-only and backed up by permanent storage devices, while DeltaStore only resides in main memory for both reads and writes.

Figure 3: AlphaStore and DeltaStore.

When RateupDB is turned on, it loads AlphaStore from permanent storage devices into the main memory. For each table, AlphaStore contains only its columnar data without any MVCC information. After loading, the columnar data become read-only. RateupDB will add a special header column for each table, which will be used to record MVCC information of each tuple. However, since AlphaStore is visible to all later transactions, the MVCC information stored in the header column is incomplete, which only records the xmax information (i.e., the transaction ID that deletes the tuple) without the xmin information (i.e., the transaction ID that creates the tuple).

DeltaStore stores data modifications by transactions, and can have multiple blocks. When one block is full, another block will be allocated and used, as shown in Figure 3. Although the overall design of DeltaStore is similar to PostgreSQL, it has two unique differences. First, it does not use a page concept because DeltaStore is only a main memory store [34]. Second, it uses a delete vector to record AlphaStore-related deletion information. When inserting a new tuple, its data will be appended into corresponding columns. When deleting a tuple, the tuple's xmax will be set (the same for tuples from AlphaStore and DeltaStore), while its tuple ID (only for AlphaStore) will be appended into the delete vector. When updating an old tuple with a new tuple, a delete and an insert are executed. In DeltaStore, each new tuple and each delete record have complete MVCC information (e.g., xmin and xmax and other flags).

When executing a query, the CPU will execute an MVCC scanning of DeltaStore to generate two kinds of snapshot contents: (1) the IDs of tuples in AlphaStore that are invisible to the query (i.e., already deleted) and (2) the column data of all the tuples in DeltaStore that are visible to the query. When the GPU needs to process a column, the CPU will send the two snapshot contents along with the original column in AlphaStore to GPU. In this case, the GPU does not need to handle any MVCC information, but only executes a simple kernel to filter out unnecessary tuples (like a filter kernel) from the original column and then combines the new data for consequent query processing. If the original column is already cached in the GPU device memory (AlphaCache), then only the two snapshot contents are transferred from the host memory to GPU. Considering the relatively much smaller size of DeltaStore than that of AlphaStore, such a design can efficiently accelerate query execution on the GPU side. However, MVCC scanning on the CPU side is not free of cost since the PostgreSQL-style MVCC implementation uses a unified store, where uncommitted tuples and committed tuples are mixed, and a complex visibility check for each tuple according to MVCC information (e.g., xmin and xmax) is unavoidable. Accelerating the MVCC visibility check is performance-critical; for example, Netezza [42] uses FPGA to do the check with the purpose of avoiding transferring invisible tuples. In RateupDB, we implement a special MVCC invariant, aiming to accelerate the check.

5.2 Improvement for Fast Query Execution
As an implementation approach to snapshot isolation, how to implement MVCC's visibility check for a tuple can be different. In the original paper introducing snapshot isolation [24], requirements for physical implementations are not specified beyond the following statement: "At any time, each data item might have multiple versions, created by active and committed transactions." Tuples written by uncommitted transactions should be invisible to other transactions. However, the invisibility check for such a tuple can mean two methods in different implementations: (1) a transaction can access the tuple and then find it invisible; and (2) the transaction cannot access the tuple at all. The former means that all tuples are written into a public space no matter whether or not the transaction is committed (e.g., PostgreSQL), while the latter means that an uncommitted transaction can only keep its written tuples in its private space, and those tuples are installed into the public space only when it commits. The second situation is implied in the re-defined snapshot isolation in [40], which stated that a transaction "holds the results of its own writes in local memory store". Also, the second method is more similar to the Optimistic Concurrency Control (OCC) method [69] and other concurrency control algorithms and implementations (e.g., [108, 117]).

The first method was used in an early version of RateupDB, which has been changed to the second one to accelerate query executions. For transaction executions, the second method's disadvantage compared to the first one is that it has a double write problem, which causes an inserted tuple to be written twice in memory. However, its advantage is that it avoids scanning written tuples of uncommitted transactions, particularly from transactions that need to write a number of tuples in a batched mode, which is more common in an analytics-oriented environment. Furthermore, because DeltaStore only records tuples from committed transactions, RateupDB can organize DeltaStore by the order of committed transactions, by which each transaction commitment will be appended to DeltaStore. Therefore, when a query execution begins, it only needs to scan the tuples by and before the latest committed transaction and ignore any later tuples because they are invisible to the query. More importantly, the scan does not need to compare xmin or xmax. Of course, a tuple can still be invisible if it appears in the delete vector, where only records of committed deletes stay. Another advantage of the second method is the easier merging of DeltaStore into AlphaStore. For a DeltaStore chunk that is already full, if the ID of the latest transaction that modifies the chunk has already been surpassed by the current transaction ID, then it means that the tuples in the chunk are already visible to any future transactions, and therefore the chunk can be appended into AlphaStore.
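The commit-ordered scan described above can be illustrated with a small host-side sketch. This is a simplified, hypothetical rendering (structure and parameter names are ours): it only shows how the two snapshot contents are produced without per-tuple xmin/xmax comparisons, because visible DeltaStore rows form a prefix in commit order.

  // Sketch: building the two snapshot contents from a commit-ordered DeltaStore.
  #include <cstdint>
  #include <vector>

  struct DeltaSnapshot {
      std::vector<uint64_t> deletedAlphaRowIds; // (1) AlphaStore tuples invisible to the query
      size_t visibleDeltaRowCount;              // (2) prefix of DeltaStore rows visible to it
  };

  DeltaSnapshot buildSnapshot(const std::vector<uint64_t>& commitTxnIdPerDeltaRow, // commit order
                              const std::vector<uint64_t>& deleteVector,
                              const std::vector<uint64_t>& deleteTxnIdPerEntry,
                              uint64_t latestCommittedTxn) {
      DeltaSnapshot s{ {}, 0 };
      // Visible delta rows form a prefix: stop at the first row from a later commit.
      while (s.visibleDeltaRowCount < commitTxnIdPerDeltaRow.size() &&
             commitTxnIdPerDeltaRow[s.visibleDeltaRowCount] <= latestCommittedTxn)
          ++s.visibleDeltaRowCount;
      // Likewise, only deletions made by commits up to the snapshot boundary are collected.
      for (size_t i = 0; i < deleteVector.size(); ++i)
          if (deleteTxnIdPerEntry[i] <= latestCommittedTxn)
              s.deletedAlphaRowIds.push_back(deleteVector[i]);
      return s;
  }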

6 CPU TASKS VS GPU TASKS
6.1 On Query Executions on CPU
Currently RateupDB does not support CPU OLAP queries, but relies on the CUDA-based GPU query engine only. Maintaining an additional CPU query engine is a non-trivial engineering task due to execution pattern differences between CPU and GPU. Our early GPUDB prototype supports both CUDA and OpenCL so that queries can be run on both CPU and GPU. However, we have learned from our development experience that using OpenCL in a commercial product may not be an effective choice based on a reliability consideration. Without using OpenCL, a CPU query engine totally different from a GPU CUDA query engine would require a large effort of software development.

RateupDB does allow CPU and GPU to work together to process some queries. This happens under two conditions: (1) When the GPU device memory cannot hold all the input data, a partitioning-based strategy is enabled. The CPU needs to execute necessary pre-partitioning, merging, and materialization for the data in main memory. (2) For certain SQL functions (e.g., the GROUP_CONCAT function in MySQL), if the output sizes of parallel threads are not predictable or they have non-uniform distributions, the function will be executed on the CPU because it is difficult to effectively pre-allocate GPU memory for each thread's output data.

Currently RateupDB uses the GPU to execute the WHERE condition part (which could have a subquery embedded) in an UPDATE statement. Such a strategy is suitable for a TPC-H-style workload that has two characteristics: (1) concurrent inserts, and (2) concurrent batch updates that select multiple rows from a pre-defined table of lookup keys. This approach benefits from avoiding index building on the CPU side, subject to an acceptable query latency for the updates. However, we are aware that a CPU-based index scan is unavoidable for heavy OLTP workloads with many point queries (i.e., key lookups). We plan to add such a feature in the next version.

6.2 On Indexing
Currently RateupDB does not use any index for OLAP queries. But it uses a special indexing method to accelerate constraint checks for OLTP workloads when inserting a new tuple into a table with a primary key or a foreign key. In the case of a large table, the insert operation will be dominated by the primary key check if no auxiliary indexing structure is available. Unlike a conventional database, for example PostgreSQL, which often utilizes a unified B+tree to serve the check, RateupDB uses a hybrid indexing method for the dual store. First, for DeltaStore, a hash index is built to quickly locate the tuple IDs according to the primary key. Second, for AlphaStore, due to its read-only nature, the column of the primary key is first sorted after being loaded; then, for a later primary key check, a binary search on the sorted column is executed for ID locating. The sorting is actually executed on GPUs. Without building a hash table for AlphaStore, memory space is greatly saved while offering acceptable performance.
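The hybrid check can be expressed in a few lines. The sketch below is an assumption-based illustration (types and names are hypothetical; only the "hash index for DeltaStore, binary search over the sorted AlphaStore key column" split reflects the text above).

  // Sketch: hybrid primary-key existence check for the dual store.
  #include <algorithm>
  #include <cstdint>
  #include <unordered_map>
  #include <vector>

  bool primaryKeyExists(const std::vector<int64_t>& alphaKeysSorted,   // sorted once on the GPU
                        const std::unordered_map<int64_t, uint64_t>& deltaHashIndex,
                        int64_t key) {
      if (deltaHashIndex.count(key) > 0) return true;          // recently inserted keys
      // Binary search instead of a hash table saves memory for the large, read-only store.
      return std::binary_search(alphaKeysSorted.begin(), alphaKeysSorted.end(), key);
  }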

6.3 Further Possibilities
Currently the query optimizer and the transaction engine are on CPU only. This could be done on GPU, but we did not do it for the following reasons. First, for the query optimizer, research efforts have been made to investigate possibilities of using GPUs to execute some critical tasks, such as selectivity estimation [52] and join order optimization [83]. Furthermore, recent machine learning based query optimization work (e.g., [66–68, 82, 104, 107, 112, 114]) has shed light on automatic query optimizing software using deep learning or reinforcement learning algorithms, which can greatly benefit from GPU's massively parallel computing power. However, it is still unclear how to best utilize those algorithms in a HTAP database with a balanced consideration among overall benefits, performance isolation, and engineering cost. Second, for the transaction engine that has distinct execution patterns from read-only OLAP query processing, prior studies [50][121][56] also demonstrated the possibilities of using GPUs to execute transactions or to build an underlying key-value store. However, those systems have certain limitations (e.g., only supporting pre-defined stored procedures or only being a data store without transaction support), thus they are not general-purpose transaction engines on GPU.

7 THE HETEROGENEOUS MEMORY
7.1 A Tale of Two Memories
Since the rise of general-purpose GPU computing, significant efforts in computer system research have been focused on solving problems caused by a tale of two memories: the host memory (DDR DRAM) for CPU cores and the GPU device memory (GDDR SDRAM) for GPU cores, which are physically separated, but connected by a communication link (PCIe or NVLink). Since the GPU cores can only directly access the GPU's device memory, the main disadvantages are summarized as follows.

Too small to hold a large data set: The problem of the limited size of physical GPU memory is one of the major GPU programming challenges. From the hardware and systems' perspective, various efforts have been made to add virtual memory support to provide an illusion of unlimited memory space [73][109][122][20], or to use hardware compression algorithms to save memory space [32]. From the applications' perspective, the problem causes significant additional programming efforts by partitioning large input data into multiple chunks for processing them separately, either by utilizing multiple GPU devices for parallel processing or by transferring data chunks between GPU memory and the host memory.

Too far to reach quickly: With two physically separated memories, the main bottleneck for GPU databases is the PCIe transferring, i.e., the so-called "Yin and Yang" problem [115]. A variety of optimization techniques have been proposed, such as (1) utilizing data compression to reduce transferred data size [38], (2) caching transferred data in GPU memory for data reuse [111], (3) maximizing the data usage once transferred [43], (4) avoiding unnecessary data movement by lazy and/or shared transfers [95], and (5) overlapping transfers and computations to hide latency [97, 111].

Too hard to program: The combination of the two above problems makes early-stage GPU programming hard due to the lack of a unified virtual memory that can release programmers from manually managing two separate memories. Since the Nvidia Pascal GPU architecture (CUDA 8 for software) [47], a unified memory programming model using the ManagedMemory APIs has become available, thus GPU programs can be developed with fewer memory constraints. However, significant performance degradation is still unavoidable in the case of GPU memory over-subscription due to uncontrollable page faults and thrashing [47][32].

7.2 The Design Choice
Considering the availability of multiple memory management methods for GPUs, a GPU database should balance multiple factors including single query performance, concurrent query throughput, and the cost of programming efforts. Before we introduce RateupDB's memory management implementation, we give the following two goals as the foundation of our design.

Maximized Single Query Performance: The development of RateupDB started with the GPUDB [115] prototype that treats GPU as a co-processor with explicit GPU memory allocation and data transfers. For each operator, a careful data partitioning step is needed to determine how much data can be processed at one time according to a predicted space allocation based on the running tasks and the GPU memory size. However, the difficulty of precisely estimating the needed memory space makes this method hard to use for implementing concurrent query processing.

Maximized Programming Simplicity: AlphaStore is fully implemented by the unified memory that is initially allocated using cudaMallocManaged. Thus, consequent query executions are directly applied on AlphaStore without any explicit data transfers. All the temporary data structures (e.g., intermediate results or hash tables) are also implemented in this way. Such a programming simplicity allows us to develop RateupDB without any measures on the available physical GPU size. However, uncontrolled data thrashing for large workloads can cause performance degradation.

7.3 Memory Management in RateupDB
With the consideration of the issues discussed earlier and possible design options, we take a hybrid approach for RateupDB's memory management. First, as a database server application, RateupDB partitions the GPU device memory into two zones: (1) the AlphaCache zone that is used to cache hot data in AlphaStore (currently an LRU algorithm is used for cache replacement) and (2) the work zone that provides space for kernel executions (i.e., storing necessary data structures and results of kernel executions). Second, the work zone is used in a way that needs explicit memory allocations and transfers (if needed) without using ManagedMemory APIs. Finally, RateupDB uses ManagedMemory only in some special situations where the required memory space is not easily estimated, such as a bad area for hash table building even after several rehashings due to collisions, or the output for a Cartesian product of two tables. Such cases are rare in practice, so that using ManagedMemory for them does not hurt performance, while it can significantly reduce the programming effort.

RateupDB's hybrid memory management implementation is a result of balancing performance and implementation simplicity. First, as a major performance factor for server applications, locality in the memory hierarchy plays the most important role for concurrent data accesses. However, letting CUDA's unified memory mechanism manage locality is unrealistic due to its inefficiency in handling typical memory system issues, such as thrashing [31]. Second, the approach to maximizing single query performance completely ignores inter-query locality so that each query can use all the GPU device memory for its peak performance. However, this approach is not optimized for the overall throughput considering the data reusing opportunities. Third, since the work zone is exclusively used during a query piece's execution, explicit memory management without involving any automatic mechanism by ManagedMemory can highly guarantee query performance. Finally, totally disabling the usage of the unified memory is unnecessary and inefficient. For example, pre-allocating space for data with unknown sizes in many algorithms is often achieved by a twice execution approach, which means that the algorithm (e.g., a hash probing operation) executes the first time to determine how much space is needed, and then, with the allocated memory size, executes a second time to write the output results [49, 98]. ManagedMemory can simplify this approach by dynamically allocating memory.

7.4 On the size of AlphaCache
RateupDB uses a configurable parameter to designate the size of AlphaCache. As shown in existing studies on auto-tuning database system parameters [36][55], reaching a sweet spot for optimization of both the cache size and performance is hard and is often empirically based. For RateupDB, the size of AlphaCache is related to multiple factors, and the dominating ones include the GPU memory size, the working set size, and the sizes of the most important data structures (e.g., commonly used hash tables for either join or aggregation). By default, RateupDB recommends 50% of the GPU memory for AlphaCache.

When the input table data set is much larger (by an order of magnitude) than the size of GPU device memory, a partitioning-based execution strategy has to be used so that the whole device memory is arranged to execute operations on one partition at a time. In this case, AlphaCache should be set to 0 in order to allow a single partition execution to fully utilize all the GPU device memory space. Because even a single query has to sequentially scan multiple partitions, AlphaCache cannot hold the working set to provide effective caching for concurrent query executions.

8 PERFORMANCE EVALUATION
In this section, we present a detailed performance analysis of RateupDB version 1.0. We first present its pure OLAP performance by executing each of the 22 TPC-H queries. Then we present its holistic performance by strictly executing the complete TPC-H benchmark.
System Configuration. We used a Supermicro 743TQ-X11 workstation to conduct all the experiments. The CPU is an Intel Xeon Silver 4215 with a 2.50 GHz frequency, 8 cores (with Hyper-Threading disabled), and an 11MB last-level cache. The GPU is an Nvidia Quadro RTX 8000 with 48GB GDDR6 memory and 4,608 CUDA parallel-processing cores [7], connected through a PCIe 3.0 bus. The machine has 256GB DDR4 DRAM and two mirrored 1 TB SSDs. We used Ubuntu 18.04 LTS and CUDA Toolkit 10.0.
Workloads and Software. We implemented TPC-H 2.18.0 with three scale factors (1, 10, and 100). To compare RateupDB with the state of the art in commercial GPU database products, we selected the most recent version of OmniSci, considering its wide usage in the recent research literature [44, 90, 99].

8.1 Query Performance
Table 2: Execution times (in seconds) of TPC-H queries by RateupDB and OmniSci. TPC-H scale factors used: 1, 10, and 100. (R: RateupDB, O: OmniSci, DNF: Did Not Finish.)
Query  R(1)   R(10)  R(100)  O(1)   O(10)  O(100)
Q1     0.10   0.73   7.38    0.28   0.44   2.38
Q2     0.05   0.10   0.33    DNF    DNF    DNF
Q3     0.04   0.17   0.62    0.31   0.82   DNF
Q4     0.02   0.05   0.37    DNF    DNF    DNF
Q5     0.03   0.07   0.87    0.24   0.27   DNF
Q6     0.02   0.08   0.69    0.22   0.23   0.25
Q7     0.04   0.08   0.48    0.23   0.28   DNF
Q8     0.04   0.07   0.40    0.28   0.33   DNF
Q9     0.05   0.19   1.95    0.31   0.34   DNF
Q10    0.13   0.95   1.71    0.67   DNF    DNF
Q11    0.03   0.04   0.18    DNF    DNF    DNF
Q12    0.03   0.08   0.50    0.24   0.25   DNF
Q13    0.03   0.09   1.40    DNF    DNF    DNF
Q14    0.03   0.05   0.27    DNF    DNF    DNF
Q15    0.03   0.09   0.62    0.31   0.40   0.66
Q16    0.06   0.08   0.23    0.97   DNF    DNF
Q17    0.02   0.04   0.39    DNF    DNF    DNF
Q18    0.04   0.24   4.27    0.64   4.82   DNF
Q19    0.03   0.06   0.20    0.23   0.27   0.68
Q20    0.04   0.13   1.35    DNF    DNF    DNF
Q21    0.04   0.13   3.28    DNF    DNF    DNF
Q22    0.03   0.03   0.13    DNF    DNF    DNF
We first measure the read-only query execution performance of RateupDB and OmniSci. Table 2 lists the execution times of all 22 queries for the three TPC-H scale factors. Each query was executed four times, and the results reported in the table are from the fourth execution. In the next several subsections, we look into the performance insights behind our experimental observations.
8.1.1 Overall Analysis. RateupDB can finish all the queries at all three scale factors. However, OmniSci can only finish 13 queries when using scale factor 1, an observation consistent with a prior study conducted in [44], and it has more failure cases when increasing to scale factors 10 and 100. The results also show the effectiveness of RateupDB's subquery optimizations for complex queries.
With scale factor 1, i.e., a total data size of 1GB, RateupDB executed most queries in less than 0.1s. When increasing to scale factor 10, all query execution times increase correspondingly, but are still less than 1s. OmniSci's results are diverse: some queries' times are almost unchanged from scale factor 1 to 10 (for example, Q5-Q8); some queries cannot be finished (Q10, Q16); and some queries have significantly long execution times (Q18).
When using scale factor 100, RateupDB has several exceptional queries that cannot be finished in less than 1s. The longest query executions occurred for Q1, Q18, Q21, and Q9, which we analyze specifically later. OmniSci can only finish 4 out of 22 queries (Q1, Q6, Q15, Q19). By examining the error information, we found that most of the failed cases are related to three exceptions: (1) a too-large hash table, which is related to multiple joins (e.g., Q9), (2) too-large memory usage (e.g., Q18), and (3) a non-serializable type, which often occurs in subquery handling (e.g., Q2).
8.1.2 Query with heavy aggregation: TPC-H Q1. TPC-H Q1 is a typical query that needs to execute multiple aggregations on a large number of records. RateupDB's execution times for the three scale factors are 0.10s, 0.73s, and 7.38s, respectively, which basically reflects a linear increase in time with the input data size. However, after making careful time breakdowns using a set of micro experiments, we found that this increase is not really caused by operations like grouping or filtering in Q1, but by the implementation of fixed-point arithmetic operations, which is required for finance-related database workloads (e.g., TPC-H) [85].
Because we cannot use double precision to execute TPC-H, the GPU's high TFLOPS cannot be easily converted into high performance for fixed-point arithmetic operations. We implement a general-purpose GPU library for decimal representations and calculations, which uses a fixed 20-byte design to support a (36, 30) precision/scale, while OmniSci supports a much smaller precision, using a 2-8 byte design with at most 18 digits of precision. We did not implement any TPC-H-specific optimizations for the decimal type (e.g., [26]) in RateupDB. Therefore its Q1 performance (as well as part of Q18, which has a subquery with heavy aggregation) is dominated by fixed-point arithmetic operations.
8.1.3 Query with heavy join: TPC-H Q9. As a typical example of an OLAP query involving multiple joins, Q9 is often a key performance indicator for database OLAP performance. Due to its optimization for star joins, RateupDB consistently outperforms OmniSci on Q9 at all three scale factors. RateupDB uses its Star Join operator to execute the multiple joins in the query. In this way, tuples from the fact table can iteratively probe multiple hash tables, achieving the goal of maximized data usage in GPU memory [44, 90]. Although RateupDB does not yet take a JIT-compilation-based kernel execution approach, its use of the Star Join operator effectively achieves a similar effect. Since a star-like join is a major join pattern in OLAP workloads, RateupDB's implementation can be widely used. In fact, Star Join is also used to execute other TPC-H queries in RateupDB, including Q5, Q7, Q8, and Q10.
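The probing pattern described in Section 8.1.3 can be sketched as a single kernel in which each thread takes one fact-table tuple and probes several dimension hash tables in sequence. The sketch below is an assumption-laden illustration, not RateupDB's actual operator: it uses a simplified open-addressing layout, only two dimension tables, and a made-up combined payload.

// Illustrative star-join probe: one thread per fact tuple, probing two
// dimension hash tables kept in GPU memory (simplified open addressing;
// keys are assumed non-negative, -1 marks an empty slot).
#include <cuda_runtime.h>

struct HashTable {
    const long long* keys;   // slot key, or -1 for an empty slot
    const int*       vals;   // payload per slot (e.g., a group id)
    int              slots;  // number of slots (assumed > 0)
};

__device__ int probe(const HashTable& ht, long long key) {
    int h = (int)(key % ht.slots);
    for (int step = 0; step < ht.slots; ++step) {
        int slot = (h + step) % ht.slots;        // linear probing
        long long k = ht.keys[slot];
        if (k == key) return ht.vals[slot];      // match found
        if (k == -1)  return -1;                 // empty slot: key not present
    }
    return -1;
}

// For each fact tuple, probe both dimension tables; only tuples that match
// both survive (a combined payload is written out, otherwise -1).
__global__ void star_join_probe(const long long* fk1, const long long* fk2,
                                HashTable dim1, HashTable dim2,
                                int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int v1 = probe(dim1, fk1[i]);
    if (v1 < 0) { out[i] = -1; return; }         // fails the first join
    int v2 = probe(dim2, fk2[i]);
    out[i] = (v2 < 0) ? -1 : v1 * 1000 + v2;     // illustrative combined payload
}

The point of combining the probes in one pass is that each fact tuple is read once while all dimension hash tables stay resident in GPU memory, which is the "maximized data usage" effect the section refers to.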

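Returning to the fixed-point arithmetic issue from Section 8.1.2: a wide decimal cannot use the GPU's native floating-point path, so even a simple addition becomes multi-word integer arithmetic with explicit carry propagation. The sketch below uses a simplified 128-bit, two-limb representation (the paper's library uses a 20-byte design); it illustrates the cost pattern only and is not RateupDB's decimal library.

// Illustrative 128-bit fixed-point "decimal" addition on the GPU: each value
// is two 64-bit limbs, and addition must propagate a carry explicitly instead
// of mapping to a single hardware floating-point instruction.
#include <cuda_runtime.h>

struct Dec128 {
    unsigned long long lo;   // low 64 bits
    long long          hi;   // high 64 bits (two's-complement sign)
};

__host__ __device__ inline Dec128 dec_add(Dec128 a, Dec128 b) {
    Dec128 r;
    r.lo = a.lo + b.lo;
    unsigned long long carry = (r.lo < a.lo) ? 1ull : 0ull;  // detect wraparound
    r.hi = a.hi + b.hi + (long long)carry;
    return r;
}

// Element-wise kernel, e.g., accumulating per-row charges before aggregation.
__global__ void dec_add_kernel(const Dec128* a, const Dec128* b, Dec128* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = dec_add(a[i], b[i]);
}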
8.1.4 Query with complex subquery: TPC-H Q21. Subquery optimization is an important measure for RateupDB to handle various OLAP queries. We also believe that the ability to handle various subqueries is a major leap for a GPU DBMS to become an industry product, because academic research prototypes often focus on simple analytics [41, 90, 115]. As shown in Table 2, RateupDB can effectively execute all the TPC-H queries that contain subqueries by using the corresponding unnesting techniques.
As a classical case among the TPC-H queries, we had to make a significant effort to optimize Q21, which challenges a database system's query optimizer and engine implementation. As introduced in Section 4.3.2, RateupDB uses a Self operator to execute all the subqueries in the query. Reflected by both the Self operator and the Star Join operator used in Q9 (and also in Q21), orchestrating multiple operations into a combined one is a key optimization.

8.2 On Memory Management
We conducted an experiment to examine how two different memory management methods affect query performance when executing all 22 TPC-H queries. The first is RateupDB's old solution, which uses only ManagedMemory, and the second is RateupDB's new solution with AlphaCache. In this experiment, AlphaCache takes 50% of the GPU device memory, i.e., 24GB. Figure 4 shows the speedups of the new solution over the old one for all 22 queries: the execution of most queries is significantly accelerated, and the average speedup is 1.22X.
Figure 4: Performance comparison between RateupDB's hybrid memory management solution and the default ManagedMemory-based solution.
With ManagedMemory, the query processing on both the intermediate data and the input table data is automatically managed by the GPU VM mechanism. However, the disadvantage of this approach is that the database cannot control the memory space contention between these two kinds of data. With the new solution, we explicitly divide the memory space into the table zone (AlphaCache), which caches the shared table data, and the work zone, which stores non-shared query intermediate data. Therefore, we can guarantee that the intermediate data are kept in GPU device memory during query execution, and cache replacement can only occur among the input table data.

8.3 Transaction Performance
TPC-H requires two refresh functions (RF1 and RF2) that execute concurrent inserts into and deletes from ORDERS and LINEITEM. We have implemented the two functions as two Snapshot Isolation transactions in RateupDB. The newly inserted data items and the keys of the deleted data items are first stored in several temporary tables, which are then used by the transactions. The code sections of RF1 and RF2 illustrate their logic flows, where t1-t4 are the temporary tables. For TPC-H scale factor 100, the number of rows being inserted or deleted is 150,000 for ORDERS and (1-7) times more for LINEITEM. We use primary keys for the two tables but do not use any foreign key connecting them, which is allowed by the TPC-H specification. When executing RF1, the insert performs an additional constraint check for primary key conflicts.
RF1:
  begin transaction;
  insert into ORDERS select * from t1;
  insert into LINEITEM select * from t2;
  commit;
RF2:
  begin transaction;
  delete from LINEITEM where L_ORDERKEY in (select orderkey from t3);
  delete from ORDERS where O_ORDERKEY in (select orderkey from t4);
  commit;
According to OmniSci's documentation, transactions are not supported, so we cannot compare its performance with RateupDB for this workload. Furthermore, there are no public TPC-H results for any GPU-accelerated database product. Therefore, we select published performance results of CPU databases for comparison. Since RateupDB is a non-cluster system, we choose the fastest non-cluster result on the TPC website for the same scale factor (VectorWise 3.0.0). Table 3 lists the execution times of RF1 and RF2 by the two systems for the TPC-H Power Test, in which the transactions are executed alone without any interference. The results show that the execution times are comparable. However, the implementations of RF1 and RF2 in the two systems are not necessarily the same, considering multiple factors such as index building, constraint checking, and transaction implementations.
Table 3: Execution times (s) of TPC-H RF1 and RF2 by RateupDB and VectorWise for the TPC-H Power Test.
       RateupDB 1.0   VectorWise 3.0.0
RF1    4.06           6.5
RF2    1.71           2.1
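The RF1/RF2 transactions above run under Snapshot Isolation. The paper does not show the engine internals, but as a generic, hypothetical illustration of the rule such a scheme enforces, a row version is visible to a transaction only if it was committed at or before the transaction's snapshot and not deleted as of that snapshot:

// Generic snapshot-isolation visibility check (illustrative; not RateupDB's code).
#include <cstdint>
#include <cstdio>

struct RowVersion {
    uint64_t create_ts;  // commit timestamp of the inserting transaction (0 = uncommitted)
    uint64_t delete_ts;  // commit timestamp of the deleting transaction (0 = not deleted)
};

// A version is visible to a snapshot taken at `snapshot_ts` if it was created
// at or before the snapshot and deleted (if at all) only afterwards.
bool visible(const RowVersion& v, uint64_t snapshot_ts) {
    return v.create_ts != 0 && v.create_ts <= snapshot_ts &&
           (v.delete_ts == 0 || v.delete_ts > snapshot_ts);
}

int main() {
    RowVersion order_row{100, 0};          // inserted by an RF1 committing at ts=100
    printf("query at ts=90:  %d\n", visible(order_row, 90));   // 0: not yet visible
    printf("query at ts=150: %d\n", visible(order_row, 150));  // 1: visible
    order_row.delete_ts = 200;             // later removed by an RF2 committing at ts=200
    printf("query at ts=150: %d\n", visible(order_row, 150));  // 1: snapshot predates the delete
    printf("query at ts=250: %d\n", visible(order_row, 250));  // 0: deleted before the snapshot
    return 0;
}

Under this rule, concurrent OLAP query streams read a consistent snapshot while the refresh transactions insert and delete rows, which is what the later throughput test exercises.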

8.4 RateupDB for Complete TPC-H Test
After presenting the OLAP performance and the transaction performance, we are now in a position to examine how RateupDB handles hybrid workloads with both OLAP and OLTP operations. We use a complete, industry-standard TPC-H benchmark execution as the workload, which has three distinct features compared to the performance characterization workloads commonly used in the research literature. First, its power test applies a strict order of sequential executions of the 22 queries. Second, its throughput test requires explicit transaction execution streams alongside the query execution streams in order to examine the effects of co-running OLAP and OLTP. Third, it requires a minimum number of concurrent OLAP query streams, so that the query engine must balance both the overall throughput and the single-query response time.
Table 4 shows RateupDB's complete TPC-H performance compared to the published results of three other database systems, which are publicly available on the TPC website. We choose (1) the fastest one (EXASOL EXASolution 5.0) regardless of hardware configuration, (2) the fastest single-node non-cluster system (Actian VectorWise 3.0.0), and (3) the most recently published one at this scale factor (Goldilocks v3.1). Compared to the three CPU-only database systems, RateupDB currently sits between the highly optimized analytical database (i.e., VectorWise) and the hybrid database (i.e., Goldilocks). A detailed analysis is out of the scope of this paper due to the page limit. RateupDB's TPC-H performance in Table 4 is limited by its unoptimized fixed-point arithmetic implementation and its use of a single GPU device.
Table 4: Complete TPC-H results compared to representative published results (TPC-H Scale Factor 100).
Database                  Hardware                        Power        Throughput   Composite    Report Date
EXASOL EXASolution 5.0    6-node cluster, 120 cores       1,265,179.1  1,980,000.0  1,582,736.4  09/23/14
Actian VectorWise 3.0.0   non-cluster, 16 cores           458,664.5    384,764.9    420,092.4    05/13/13
RateupDB 1.0              non-cluster, 8 cores, RTX8000   354,183.3    296,476.0    324,047.6    Not Yet
Goldilocks v3.1           6-node cluster, 216 cores       6,569.7      87,954.7     24,307.0     12/19/18
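For reference, the Composite column in Table 4 follows the standard TPC-H metric definition: the composite queries-per-hour score is the geometric mean of the power and throughput metrics,

\[
\mathrm{QphH@Size} \;=\; \sqrt{\mathrm{Power@Size} \times \mathrm{Throughput@Size}}.
\]

For example, RateupDB's row gives $\sqrt{354{,}183.3 \times 296{,}476.0} \approx 324{,}047.6$, matching its Composite value.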
9 AN EXCHANGE OF OPPOSITE VIEWS
Following the critical-thinking style of discussing fallacies and pitfalls [54, 60], this section presents lessons and insights about issues that matter, drawn from our experience of building RateupDB.

9.1 Fallacies
Fallacy (1): An algorithm's adoption in products only depends on its performance. In practice, the selection of an algorithm depends on the underlying system and on development cost. Specifically, we must consider multiple factors, including its execution performance, its adaptiveness to different dynamic situations (e.g., data distributions and hardware parameters), its implementation easiness, and its degree of decoupling from other system components. Another challenge is to develop a clear rule that stands the test of time for judging under what conditions the algorithm should be used.
Fallacy (2): An HTAP DBMS only needs a unified hybrid data format. Despite the existence of various unified hybrid data formats that combine the advantages of both row stores and column stores, the rise of Heterogeneous HTAP (H2TAP) DBMSs brings new challenges to the data store, with a set of new requirements including hardware efficiency, performance isolation, and engineering cost. As shown by a set of industry-level HTAP database products, a partitioned, multi-staged, and combined data store can effectively support hybrid workloads, particularly in GPU-accelerated systems.
Fallacy (3): A GPU DBMS's performance depends only on how database operations are executed on the GPU. Existing research-oriented GPU database projects are often focused on how to best utilize GPUs to accelerate certain database operations. However, significant performance improvements are often made at the logical level of the database, independent of the GPU implementation. TPC-H Q21 is such an example, which demonstrates the benefit of best utilizing intra-query correlation information.

9.2 Pitfalls
Pitfall (1): Determining when to use an algorithm is even harder than how to implement it. The role of query optimization has been studied for decades, but unsolved issues still exist. There are multiple factors to be considered in a unified cost function to estimate the execution cost of a physical operator. However, achieving this goal is a non-trivial task due to the existence of different dynamic factors. RateupDB's approach is to avoid this problem where possible instead of solving it directly.
Pitfall (2): GPU query performance is closely related to the underlying data type. A GPU data type implementation must consider both storage overhead and computing efficiency. The performance of CUDA primitive data types and user-defined complex data types can be very different; for example, the decimal type requires fixed-point arithmetic operations instead of the GPU's inherent floating-point support. Comparing GPU performance without considering data type implementations can lead to misleading conclusions.
Pitfall (3): A GPU query engine cannot rely only on wrapping various CUDA libraries. The rapid development of the Nvidia CUDA ecosystem has provided effective functions, such as the Unified Memory APIs and other library functions (e.g., CUB or Thrust). However, these CUDA libraries cannot fully satisfy key requirements of a complete database, for example, handling arbitrary data sizes and types. A GPU query engine should have its own solutions rather than relying solely on CUDA libraries.
Pitfall (4): Advanced GPU VM facilities are not sufficient to manage device memory locality. The separation between device memory and host memory creates a challenge for GPU memory management. A query engine that only uses the GPU's VM facilities can suffer uncontrolled performance degradation. RateupDB's solution is to manually manage the device memory space for both data locality and query performance.

10 CONCLUSION
We present a comprehensive study of RateupDB, a high-performance database system for both OLAP and OLTP workloads on a CPU/GPU hybrid platform. A major contribution of this paper is to have identified a large spectrum of design possibilities for RateupDB, aiming to justify the art of balance in its design and implementation. Based on our product development experience, we provide a set of critical discussions on multiple fallacies and pitfalls. To the best of our knowledge, this is the first paper to systematically explain a GPU HTAP system in terms of its design and implementation, and its holistic performance under an industry-standard database benchmark (TPC-H). We believe that our experience of building and evaluating RateupDB will benefit the research and development of high-performance databases in both academia and industry.

3010 REFERENCES arXiv:1910.09598 (2019). [1] [n.d.]. BigQuery ML. https://cloud.google.com/bigquery-ml/docs/introduction [32] Esha Choukse, Michael B Sullivan, Mike O’Connor, Mattan Erez, Jeff Pool, David [2] [n.d.]. Brytlyt. https://www.brytlyt.com/ Nellans, and Stephen W Keckler. 2020. Buddy compression: Enabling larger [3] [n.d.]. CUB. https://nvlabs.github.io/cub/ memory for deep learning and HPC workloads on gpus. In 2020 ACM/IEEE [4] [n.d.]. HeteroDB. https://www.heterodb.com 47th Annual International Symposium on Computer Architecture (ISCA). IEEE, [5] [n.d.]. Introducing AresDB: Uber’s GPU-Powered Open Source, Real-time Analytics 926–939. Engine. https://eng.uber.com/aresdb/ [33] Umeshwar Dayal, Malu Castellanos, Alkis Simitsis, and Kevin Wilkinson. 2009. [6] [n.d.]. Kinetica. https://www.kinetica.com/ flows for business intelligence. In Proceedings of the 12th Inter- [7] [n.d.]. Nvidia Quadro RTX 8000. https://www.nvidia.com/en-us/design- national Conference on Extending Database Technology: Advances in Database visualization/quadro/rtx-8000/ Technology. 1–11. [8] [n.d.]. NVLINK AND NVSWITCH. https://www.nvidia.com/en-us/data-center/ [34] Justin DeBrabant, Andrew Pavlo, Stephen Tu, Michael Stonebraker, and Stan nvlink/ Zdonik. 2013. Anti-caching: A new approach to database management system [9] [n.d.]. OmniSci. https://www.omnisci.com/ architecture. Proceedings of the VLDB Endowment 6, 14 (2013), 1942–1953. [10] [n.d.]. SQream. https://www.sqream.com/ [35] Harish Doraiswamy and Juliana Freire. 2020. A gpu-friendly geometric data [11] [n.d.]. TPC-H. http://tpc.org/tpch/default5.asp model and algebra for spatial queries. In Proceedings of the 2020 ACM SIGMOD [12] [n.d.]. USPS Deploys Kinetica to Optimize its Business Operations. International Conference on Management of Data. 1875–1885. https://www.kinetica.com/wp-content/uploads/2016/12/Kinetica_CaseStudy_ [36] Songyun Duan, Vamsidhar Thummala, and Shivnath Babu. 2009. Tuning data- USPS_1.0_WEB.pdf base configuration parameters with iTuned. Proceedings of the VLDB Endowment [13] Daniel J Abadi, Samuel R Madden, and Nabil Hachem. 2008. Column-stores 2, 1 (2009), 1246–1257. vs. row-stores: how different are they really?. In Proceedings of the 2008 ACM [37] Mostafa Elhemali, César A Galindo-Legaria, Torsten Grabs, and Milind M Joshi. SIGMOD international conference on Management of data. 967–980. 2007. Execution strategies for SQL subqueries. In Proceedings of the 2007 ACM [14] Athena Ahmadi, Alexander Behm, Nagesh Honnalli, Chen Li, Lingjie Weng, SIGMOD international conference on Management of data. 993–1004. and Xiaohui Xie. 2012. Hobbes: optimized gram-based methods for efficient [38] Wenbin Fang, Bingsheng He, and Qiong Luo. 2010. Database compression on read alignment. Nucleic acids research 40, 6 (2012), e41–e41. graphics processors. Proceedings of the VLDB Endowment 3, 1-2 (2010), 670–680. [15] Anastassia Ailamaki, David J DeWitt, Mark D Hill, and Marios Skounakis. 2001. [39] Franz Färber, Sang Kyun Cha, Jürgen Primsch, Christof Bornhövd, Stefan Sigg, Weaving Relations for Cache Performance.. In VLDB, Vol. 1. 169–180. and Wolfgang Lehner. 2012. SAP HANA database: data management for modern [16] Ablimit Aji, George Teodoro, and Fusheng Wang. 2014. Haggis: turbocharge business applications. ACM Sigmod Record 40, 4 (2012), 45–51. a MapReduce based spatial data warehousing system with GPU engine. 
In [40] Alan Fekete, Dimitrios Liarokapis, Elizabeth O’Neil, Patrick O’Neil, and Dennis Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Shasha. 2005. Making snapshot isolation serializable. ACM Transactions on Big Geospatial Data. 15–20. Database Systems (TODS) 30, 2 (2005), 492–528. [17] Martina-Cezara Albutiu, Alfons Kemper, and Thomas Neumann. 2012. Massively [41] Sofoklis Floratos, Mengbai Xiao, Hao Wang, Chengxin Guo, Yuan Yuan, Rubao Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems. Proc. Lee, and Xiaodong Zhang. 2021. Nestgpu: Nested query processing on gpu. VLDB Endow. 5, 10 (June 2012), 1064–1075. In 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, [18] Raja Appuswamy, Manos Karpathiotakis, Danica Porobic, and Anastasia Aila- 1008–1019. maki. 2017. The case for heterogeneous HTAP. In 8th Biennial Conference on [42] Phil Francisco. 2011. IBM PureData System for Analytics Architecture Innovative Data Systems Research. A Platform for High Performance Data Warehousing and Analytics. In [19] Manos Athanassoulis, Michael S Kester, Lukas M Maas, Radu Stoica, Stratos https://www.redbooks.ibm.com/redpapers/pdfs/redp4725.pdf. Idreos, Anastasia Ailamaki, and Mark Callaghan. 2016. Designing Access Meth- [43] Henning Funke, Sebastian Breß, Stefan Noll, Volker Markl, and Jens Teubner. ods: The RUM Conjecture.. In EDBT, Vol. 2016. 461–466. 2018. Pipelined query processing in coprocessor environments. In Proceedings [20] Rachata Ausavarungnirun, Joshua Landgraf, Vance Miller, Saugata Ghose, of the 2018 International Conference on Management of Data. 1603–1618. Jayneel Gandhi, Christopher J Rossbach, and Onur Mutlu. 2017. Mosaic: a [44] Henning Funke and Jens Teubner. 2020. Data-parallel query processing on GPU memory manager with application-transparent support for multiple page non-uniform data. Proceedings of the VLDB Endowment 13, 6 (2020), 884–897. sizes. In Proceedings of the 50th Annual IEEE/ACM International Symposium on [45] Oded Green, Robert McColl, and David A Bader. 2012. GPU merge path: a GPU Microarchitecture. 136–150. merging algorithm. In Proceedings of the 26th ACM international conference on [21] Cagri Balkesen, Gustavo Alonso, Jens Teubner, and M Tamer Özsu. 2013. Multi- Supercomputing. 331–340. core, main-memory joins: Sort vs. hash revisited. Proceedings of the VLDB [46] Stephen Lien Harrell, Joy Kitson, Robert Bird, Simon John Pennycook, Jason Endowment 7, 1 (2013), 85–96. Sewall, Douglas Jacobsen, David Neill Asanza, Abaigail Hsu, Hector Carrillo [22] Nagender Bandi, Chengyu Sun, Divyakant Agrawal, and Amr El Abbadi. 2004. Carrillo, Hessoo Kim, et al. 2018. Effective performance portability. In 2018 Hardware acceleration in commercial databases: A case study of spatial opera- IEEE/ACM International Workshop on Performance, Portability and Productivity tions. In Proceedings of the Thirtieth international conference on Very large data in HPC (P3HPC). IEEE, 24–36. bases-Volume 30. 1021–1032. [47] Mark Harris. 2017. Unified Memory for CUDA Beginners. In [23] Srikanth Bellamkonda, Rafi Ahmed, Andrew Witkowski, Angela Amor, Mo- https://developer.nvidia.com/blog/unified-memory-cuda-beginners/. hamed Zait, and Chun-Chieh Lin. 2009. Enhanced subquery optimizations in [48] Bingsheng He, Mian Lu, Ke Yang, Rui Fang, Naga K Govindaraju, Qiong Luo, oracle. Proceedings of the VLDB Endowment 2, 2 (2009), 1366–1377. and Pedro V Sander. 2009. Relational query coprocessing on graphics processors. 
[24] Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick ACM Transactions on Database Systems (TODS) 34, 4 (2009), 1–39. O’Neil. 1995. A critique of ANSI SQL isolation levels. ACM SIGMOD Record 24, [49] Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga Govindaraju, Qiong Luo, and 2 (1995), 1–10. Pedro Sander. 2008. Relational joins on graphics processors. In Proceedings of [25] Spyros Blanas, Yinan Li, and Jignesh M Patel. 2011. Design and evaluation of the 2008 ACM SIGMOD international conference on Management of data. ACM, main memory hash join algorithms for multi-core CPUs. In Proceedings of the 511–524. 2011 ACM SIGMOD International Conference on Management of data. 37–48. [50] Bingsheng He and Jeffrey Xu Yu. 2011. High-Throughput Transaction Executions [26] Peter Boncz, Thomas Neumann, and Orri Erling. 2013. TPC-H analyzed: Hidden on Graphics Processors. Proceedings of the VLDB Endowment 4, 5 (2011). messages and lessons learned from an influential benchmark. In Technology [51] Yongqiang He, Rubao Lee, Yin Huai, Zheng Shao, Namit Jain, Xiaodong Zhang, Conference on Performance Evaluation and Benchmarking. Springer, 61–76. and Zhiwei Xu. 2011. RCFile: A fast and space-efficient data placement struc- [27] Peter A Boncz, Stefan Manegold, Martin L Kersten, et al. 1999. Database ar- ture in MapReduce-based warehouse systems. In 2011 IEEE 27th International chitecture optimized for the new bottleneck: Memory access. In VLDB, Vol. 99. Conference on Data Engineering. IEEE, 1199–1208. 54–65. [52] Max Heimel and Volker Markl. 2012. A First Step Towards GPU-assisted Query [28] Sebastian Breß. 2014. The design and implementation of CoGaDB: A column- Optimization. ADMS@ VLDB 2012 (2012), 33–44. oriented GPU-accelerated DBMS. Datenbank-Spektrum 14, 3 (2014), 199–209. [53] Max Heimel, Michael Saecker, Holger Pirk, Stefan Manegold, and Volker Markl. [29] Zhichao Cao, Siying Dong, Sagar Vemuri, and David HC Du. 2020. Character- 2013. Hardware-oblivious parallelism for in-memory column-stores. Proceedings izing, modeling, and benchmarking rocksdb key-value workloads at facebook. of the VLDB Endowment 6, 9 (2013), 709–720. In 18th {USENIX} Conference on File and Storage Technologies ({FAST} 20). [54] John L Hennessy and David A Patterson. 2018. Computer architecture: a quanti- 209–223. tative approach, 6th edition. Elsevier. [30] Jatin Chhugani, Anthony D Nguyen, Victor W Lee, William Macy, Mostafa [55] Herodotos Herodotou, Yuxing Chen, and Jiaheng Lu. 2020. A survey on au- Hagog, Yen-Kuang Chen, Akram Baransi, Sanjeev Kumar, and Pradeep Dubey. tomatic parameter tuning for big data processing systems. ACM Computing 2008. Efficient implementation of sorting on multi-core SIMD CPU architecture. Surveys (CSUR) 53, 2 (2020), 1–37. Proceedings of the VLDB Endowment 1, 2 (2008), 1313–1324. [56] Tayler H Hetherington, Mike O’Connor, and Tor M Aamodt. 2015. Mem- [31] Steven WD Chien, Ivy B Peng, and Stefano Markidis. 2019. Performance cachedgpu: Scaling-up scale-out key-value stores. In Proceedings of the Sixth evaluation of advanced features in CUDA unified memory. arXiv preprint ACM Symposium on Cloud Computing. 43–57.

3011 [57] Yin Huai, Ashutosh Chauhan, Alan Gates, Gunther Hagleitner, Eric N Hanson, 6 (2012), 878–879. Owen O’Malley, Jitendra Pandey, Yuan Yuan, Rubao Lee, and Xiaodong Zhang. [81] Darko Makreshanski, Jana Giceva, Claude Barthels, and Gustavo Alonso. 2017. 2014. Major technical advancements in apache hive. In Proceedings of the 2014 BatchDB: Efficient isolated execution of hybrid OLTP+ OLAP workloads for ACM SIGMOD international conference on Management of data. 1235–1246. interactive applications. In Proceedings of the 2017 ACM International Conference [58] Yin Huai, Siyuan Ma, Rubao Lee, Owen O’Malley, and Xiaodong Zhang. 2013. on Management of Data. 37–50. Understanding insights into the basic structure and essential issues of table [82] Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, placement methods in clusters. Proceedings of the VLDB Endowment 6, 14 (2013), Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul23. 2019. Neo: A Learned 1750–1761. Query Optimizer. Proceedings of the VLDB Endowment 12, 11 (2019), 1705–1718. [59] Dongxu Huang, Qi Liu, Qiu Cui, Zhuhe Fang, Xiaoyu Ma, Fei Xu, Li Shen, Liu [83] Andreas Meister. 2015. GPU-accelerated join-order optimization. In The VLDB Tang, Yuxing Zhou, Menglong Huang, et al. 2020. TiDB: a Raft-based HTAP PhD workshop, PVLDB, Vol. 176. 1. database. Proceedings of the VLDB Endowment 13, 12 (2020), 3072–3084. [84] Todd Mostak. 2013. An overview of MapD (massively parallel database). White [60] Norman P Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, paper. Massachusetts Institute of Technology (2013). Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, et al. 2017. [85] Thomas Neumann. 2015. The price of correctness. In In-datacenter performance analysis of a tensor processing unit. In Proceedings http://databasearchitects.blogspot.com/2015/12/the-price-of-correctness.html. of the 44th annual international symposium on computer architecture. 1–12. [86] Thomas Neumann and Alfons Kemper. 2015. Unnesting arbitrary queries. [61] Tomas Karnagel, René Müller, and Guy M Lohman. 2015. Optimizing GPU- Datenbanksysteme für Business, Technologie und Web (BTW 2015) (2015). accelerated Group-By and Aggregation. ADMS@ VLDB 8 (2015), 20. [87] Saher Odeh, Oded Green, Zahi Mwassi, Oz Shmueli, and Yitzhak Birk. 2012. [62] Alfons Kemper and Thomas Neumann. 2011. HyPer: A hybrid OLTP&OLAP Merge path-parallel merging made simple. In 2012 IEEE 26th International Parallel main memory database system based on virtual memory snapshots. In 2011 and Distributed Processing Symposium Workshops & PhD Forum. IEEE, 1611– IEEE 27th International Conference on Data Engineering. IEEE, 195–206. 1618. [63] Changkyu Kim, Tim Kaldewey, Victor W Lee, Eric Sedlar, Anthony D Nguyen, [88] Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and An- Nadathur Satish, Jatin Chhugani, Andrea Di Blas, and Pradeep Dubey. 2009. drew Tomkins. 2008. Pig latin: a not-so-foreign language for data processing. In Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs. Proceedings of the 2008 ACM SIGMOD international conference on Management Proceedings of the VLDB Endowment 2, 2 (2009), 1378–1389. of data. 1099–1110. [64] Kangnyeon Kim, Tianzheng Wang, Ryan Johnson, and Ippokratis Pandis. 2016. [89] , Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Ermia: Fast memory-optimized database system for heterogeneous workloads. 
Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, In Proceedings of the 2016 International Conference on Management of Data. Mendel Rosenblum, et al. 2010. The case for RAMClouds: scalable high- 1675–1687. performance storage entirely in DRAM. ACM SIGOPS Operating Systems Review [65] Won Kim. 1982. On optimizing an SQL-like nested query. ACM Transactions on 43, 4 (2010), 92–105. Database Systems (TODS) 7, 3 (1982), 443–469. [90] Johns Paul, Bingsheng He, Shengliang Lu, and Chiew Tong Lau. 2020. Improving [66] Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, and execution efficiency of just-in-time compilation based query processing on GPUs. Alfons Kemper. 2018. Learned cardinalities: Estimating correlated joins with Proceedings of the VLDB Endowment 14, 2 (2020), 202–214. deep learning. arXiv preprint arXiv:1809.00677 (2018). [91] Sukhada Pendse, Vasudha Krishnaswamy, Kartik Kulkarni, Yunrui Li, Tirthankar [67] Tim Kraska, Mohammad Alizadeh, Alex Beutel, H Chi, Ani Kristo, Guillaume Lahiri, Vivekanandhan Raja, Jing Zheng, Mahesh Girkar, and Akshay Kulkarni. Leclerc, Samuel Madden, Hongzi Mao, and Vikram Nathan. 2019. Sagedb: A 2020. Oracle Database In-Memory on Active Data Guard: Real-time Analytics learned database system. In CIDR. on a Standby Database. In 2020 IEEE 36th International Conference on Data [68] Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, and Ion Engineering (ICDE). IEEE, 1570–1578. Stoica. 2018. Learning to optimize join queries with deep reinforcement learning. [92] Simon J Pennycook, Jason D Sewall, and Victor W Lee. 2016. A metric for arXiv preprint arXiv:1808.03196 (2018). performance portability. arXiv preprint arXiv:1611.07409 (2016). [69] Hsiang-Tsung Kung and John T Robinson. 1981. On optimistic methods for [93] An Qin, Mengbai Xiao, Jin Ma, Dai Tan, Rubao Lee, and Xiaodong Zhang. 2019. concurrency control. ACM Transactions on Database Systems (TODS) 6, 2 (1981), DirectLoad: A Fast Web-scale Index System across Large Regional Centers. 213–226. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, [70] Tirthankar Lahiri, Shasank Chavan, Maria Colgan, Dinesh Das, Amit Ganesh, 1790–1801. Mike Gleeson, Sanket Hase, Allison Holloway, Jesse Kamp, Teck-Hua Lee, et al. [94] An Qin, Yuan Yuan, Dai Tan, Pengyu Sun, Xiang Zhang, Hao Cao, Rubao Lee, 2015. Oracle database in-memory: A dual format in-memory database. In 2015 and Xiaodong Zhang. 2017. Feisu: Fast Query Execution over Heterogeneous IEEE 31st International Conference on Data Engineering. IEEE, 1253–1258. Data Sources on Large-Scale Clusters. In 2017 IEEE 33rd International Conference [71] Choudur Lakshminarayan, Thiagarajan Ramakrishnan, Awny Al-Omari, Khaled on Data Engineering (ICDE). IEEE, 1173–1182. Bouaziz, Faraz Ahmad, Sri Raghavan, and Prama Agarwal. 2019. Enterprise-wide [95] Syed Mohammad Aunn Raza, Periklis Chrysogelos, Panagiotis Sioulas, Vladimir Machine Learning using Vantage: An Integrated Analytics Platform. Indjic, Angelos Christos Anadiotis, and Anastasia Ailamaki. 2020. GPU- In 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2043–2046. accelerated data management under the test of time. In Online proceedings [72] Andrew Lamb, Matt Fuller, Ramakrishna Varadarajan, Nga Tran, Ben Vandier, of the 10th Conference on Innovative Data Systems Research (CIDR). Lyric Doshi, and Chuck Bear. 2012. The vertica analytic database: C-store 7 [96] Lucas C Villa Real and Bruno Silva. 2018. 
Full Speed Ahead: 3D Spatial Database years later. arXiv preprint arXiv:1208.4173 (2012). Acceleration with GPUs. Proceedings of the VLDB Endowment 11, 9 (2018). [73] Janghaeng Lee, Mehrzad Samadi, and Scott Mahlke. 2014. VAST: The illusion of [97] Ran Rui, Hao Li, and Yi-Cheng Tu. 2021. Eficient Join Algorithms For Large Data- a large memory space for GPUs. In 2014 23rd International Conference on Parallel base Tables in a Multi-GPU Environment. Proceedings of the VLDB Endowment Architecture and Compilation Techniques (PACT). IEEE, 443–454. 14 (2021). [74] Rubao Lee, Xiaoning Ding, Feng Chen, Qingda Lu, and Xiaodong Zhang. 2009. [98] Ran Rui and Yi-Cheng Tu. 2017. Fast equi-join algorithms on gpus: Design and MCC-DB: Minimizing cache conflicts in multi-core processors for databases. implementation. In Proceedings of the 29th international conference on scientific Proceedings of the VLDB Endowment 2, 1 (2009), 373–384. and statistical database management. 1–12. [75] Rubao Lee, Tian Luo, Yin Huai, Fusheng Wang, Yongqiang He, and Xiaodong [99] Anil Shanbhag, Samuel Madden, and Xiangyao Yu. 2020. A Study of the Funda- Zhang. 2011. Ysmart: Yet another sql-to- translator. In 2011 31st mental Performance Characteristics of GPUs and CPUs for Database Analytics. International Conference on Distributed Computing Systems. IEEE, 25–36. In Proceedings of the 2020 ACM SIGMOD international conference on Management [76] Jyoti Leeka and Kaushik Rajan. 2019. Incorporating super-operators in big-data of data. 1617–1632. query optimizers. Proceedings of the VLDB Endowment 13, 3 (2019), 348–361. [100] Vishal Sikka, Franz Färber, Wolfgang Lehner, Sang Kyun Cha, Thomas Peh, [77] Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and and Christof Bornhövd. 2012. Efficient transaction processing in SAP HANA Thomas Neumann. 2015. How good are query optimizers, really? Proceedings database: the end of a column store myth. In Proceedings of the 2012 ACM of the VLDB Endowment 9, 3 (2015), 204–215. SIGMOD International Conference on Management of Data. 731–742. [78] Jing Li, Hung-Wei Tseng, Chunbin Lin, Yannis Papakonstantinou, and Steven [101] Panagiotis Sioulas, Periklis Chrysogelos, Manos Karpathiotakis, Raja Ap- Swanson. 2016. Hippogriffdb: Balancing i/o and gpu bandwidth in big data puswamy, and Anastasia Ailamaki. 2019. Hardware-conscious hash-joins on analytics. Proceedings of the VLDB Endowment 9, 14 (2016), 1647–1658. gpus. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). [79] Yanhui Liang, Hoang Vo, Ablimit Aji, Jun Kong, and Fusheng Wang. 2016. IEEE, 698–709. Scalable 3d spatial queries for analytical pathology imaging with mapreduce. In [102] Elias Stehle and Hans-Arno Jacobsen. 2017. A memory bandwidth-efficient Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances hybrid radix sort on gpus. In Proceedings of the 2017 ACM International Conference in Geographic Information Systems. 1–4. on Management of Data. 417–432. [80] Chi-Man Liu, Thomas Wong, Edward Wu, Ruibang Luo, Siu-Ming Yiu, Yingrui [103] Mike Stonebraker, Daniel J Abadi, Adam Batkin, Xuedong Chen, Mitch Cher- Li, Bingqiang Wang, Chang Yu, Xiaowen Chu, Kaiyong Zhao, et al. 2012. SOAP3: niack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth ultra-fast GPU-based parallel alignment tool for short reads. Bioinformatics 28, O’Neil, et al. 2005. C-store: a column-oriented DBMS. In Proceedings of the 31st international conference on Very large data bases. 553–564.

3012 [104] Ji Sun and Guoliang Li. 2019. An End-to-End Learning-based Cost Estimator. [114] Xiang Yu, Guoliang Li, Chengliang Chai, and Nan Tang. 2020. Reinforcement Proceedings of the VLDB Endowment 13, 3 (2019), 307–319. learning with tree-lstm for join order selection. In 2020 IEEE 36th International [105] Diego G Tomé, Tim Gubner, Mark Raasveldt, Eyal Rozenberg, and Peter A Boncz. Conference on Data Engineering (ICDE). IEEE, 1297–1308. 2018. Optimizing Group-By and Aggregation using GPU-CPU Co-Processing.. [115] Yuan Yuan, Rubao Lee, and Xiaodong Zhang. 2013. The Yin and Yang of pro- In ADMS@ VLDB. 1–10. cessing data warehousing queries on GPU devices. Proceedings of the VLDB [106] Luan Tran, Min Y Mun, and Cyrus Shahabi. 2020. Real-time distance-based Endowment 6, 10 (2013), 817–828. outlier detection in data streams. Proceedings of the VLDB Endowment 14, 2 [116] Yuan Yuan, Meisam Fathi Salmi, Yin Huai, Kaibo Wang, Rubao Lee, and Xiaodong (2020), 141–153. Zhang. 2016. Spark-GPU: An accelerated in-memory data processing engine [107] Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan on clusters. In 2016 IEEE International Conference on Big Data (Big Data). IEEE, Jo, and Joseph Antonakakis. 2019. Skinnerdb: Regret-bounded query evaluation 273–283. via reinforcement learning. In Proceedings of the 2019 International Conference [117] Yuan Yuan, Kaibo Wang, Rubao Lee, Xiaoning Ding, Jing Xing, Spyros Blanas, on Management of Data. 1153–1170. and Xiaodong Zhang. 2016. Bcc: Reducing false aborts in optimistic concur- [108] Stephen Tu, Wenting Zheng, Eddie Kohler, , and Samuel Madden. rency control with low cost for in-memory databases. Proceedings of the VLDB 2013. Speedy transactions in multicore in-memory databases. In Proceedings of Endowment 9, 6 (2016), 504–515. the Twenty-Fourth ACM Symposium on Operating Systems Principles. 18–32. [118] Matei Zaharia, Mosharaf Chowdhury, Michael J Franklin, Scott Shenker, and [109] Kaibo Wang, Xiaoning Ding, Rubao Lee, Shinpei Kato, and Xiaodong Zhang. Ion Stoica. 2010. Spark: cluster computing with working sets. In Proceedings of 2014. GDM: Device memory management for GPGPU computing. ACM SIG- the 2nd USENIX conference on Hot topics in cloud computing. 10–10. METRICS Performance Evaluation Review 42, 1 (2014), 533–545. [119] Chaoqun Zhan, Maomeng Su, Chuangxian Wei, Xiaoqiang Peng, Liang Lin, [110] Kaibo Wang, Yin Huai1 Rubao Lee, Fusheng Wang, Xiaodong Zhang, and Joel H Sheng Wang, Zhe Chen, Feifei Li, Yue Pan, Fang Zheng, et al. 2019. Analyticdb: Saltz. 2012. Accelerating Pathology Image Data Cross-Comparison on CPU-GPU Real-time olap database system at alibaba cloud. Proceedings of the VLDB Hybrid Systems. Proceedings of the VLDB Endowment 5, 11 (2012). Endowment 12, 12 (2019), 2059–2070. [111] Kaibo Wang, Kai Zhang, Yuan Yuan, Siyuan Ma, Rubao Lee, Xiaoning Ding, [120] Hao Zhang, Gang Chen, Beng Chin Ooi, Kian-Lee Tan, and Meihui Zhang. 2015. and Xiaodong Zhang. 2014. Concurrent analytical query processing with GPUs. In-memory big data management and processing: A survey. IEEE Transactions Proceedings of the VLDB Endowment 7, 11 (2014), 1011–1022. on Knowledge and Data Engineering 27, 7 (2015), 1920–1948. [112] Wei Wang, Meihui Zhang, Gang Chen, HV Jagadish, Beng Chin Ooi, and Kian- [121] Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, and Xiaodong Zhang. Lee Tan. 2016. Database meets deep learning: Challenges and opportunities. 2015. 
Mega-kv: A case for gpus to maximize the throughput of in-memory ACM SIGMOD Record 45, 2 (2016), 17–22. key-value stores. Proceedings of the VLDB Endowment 8, 11 (2015), 1226–1237. [113] Haicheng Wu, Gregory Diamos, Tim Sheard, Molham Aref, Sean Baxter, Michael [122] Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, and Garland, and Sudhakar Yalamanchili. 2014. Red fox: An execution environment Stephen W Keckler. 2016. Towards high performance paged memory for GPUs. for relational query processing on gpus. In Proceedings of Annual IEEE/ACM In 2016 IEEE International Symposium on High Performance Computer Architec- International Symposium on Code Generation and Optimization. 44–54. ture (HPCA). IEEE, 345–357.
