The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data

Tu Gu1, Kaiyu Feng1, Gao Cong1, Cheng Long1, Zheng Wang1, Sheng Wang2
1School of Computer Science and Engineering, Nanyang Technological University, Singapore
2DAMO Academy, Alibaba Group
[email protected], {kyfeng,gaocong,c.long,wang_zheng}@ntu.edu.sg, [email protected]

ABSTRACT

Learned indices have been proposed to replace classic index structures like the B-Tree with machine learning (ML) models. They require replacing both the indices and the query processing algorithms currently deployed by databases, and such a radical departure is likely to encounter challenges and obstacles. In contrast, we propose a fundamentally different way of using ML techniques to improve the query performance of the classic R-Tree without changing its structure or its query processing algorithms. Specifically, we develop reinforcement learning (RL) based models to decide how to choose a subtree for insertion and how to split a node, instead of relying on hand-crafted heuristic rules as the R-Tree and its variants do. Experiments on real and synthetic datasets with up to 100 million spatial objects clearly show that our RL based index outperforms the R-Tree and its variants.

PVLDB Reference Format:
Tu Gu, Kaiyu Feng, Gao Cong, Cheng Long, Zheng Wang, Sheng Wang. The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data. PVLDB, 14(1): XXX-XXX, 2020.
doi:XX.XX/XXX.XX

1 INTRODUCTION

To support efficient processing of spatial queries, such as range queries and KNN queries, for application scenarios such as Google Maps and Pokémon Go with hundreds of millions of objects and a large number of user queries, spatial databases have for decades relied on carefully designed indexes. The R-Tree [15] is arguably the most popular spatial index that prunes irrelevant data for queries. R-Trees have attracted extensive research interest [2–4, 6, 8, 13, 17, 18, 23, 24, 27, 29, 31–33, 35, 37, 40, 42] and are widely used in popular databases such as PostgreSQL and MySQL.

The learned index has been proposed in [21], which proposes a recursive model index (RMI) for indexing 1-dimensional data by learning a cumulative distribution function (CDF) to map a search key to a rank in a list of ordered data objects. To address the limitations of the RMI, such as the lack of support for updates, several learned indices have been proposed based on the RMI. The idea of learned indices has also been extended to spatial data [24, 31, 40] and multi-dimensional data [6, 8, 27]. These indices usually map spatial data points to a uniform rank space (e.g., using a space filling curve), and then learn the CDF for a particular dataset. Despite the success of these learned indices in improving the performance of some types of queries, they still have various limitations, e.g., they can only handle spatial point objects and limited types of spatial queries, some only return approximate query results, and they either cannot handle updates or need a periodic rebuild to retain high query efficiency (detailed discussions are in Section 2). These limitations, together with the requirement that learned indices replace the index structures and query processing algorithms currently used by database systems, make it less likely for learned indices to be deployed immediately in current database systems.

In this work, we consider a fundamentally different way of using machine learning to improve the R-Tree, rather than learning a CDF for spatial data. We aim to use machine learning techniques to construct an R-Tree in a data-driven way for better query efficiency. Specifically, instead of relying on hand-crafted heuristic rules for the ChooseSubtree and Split operations, which are two key operations for building an R-Tree, we propose to learn from data and train machine learning models for the two operations. Note that we do not need to modify the basic structure of the R-Tree, and thus all currently deployed query processing algorithms remain applicable to our proposed index. This would make it easier for the learning based index to be deployed by current databases.

To motivate our idea, we next revisit the two key operations ChooseSubtree and Split. When inserting a new spatial object, an R-Tree needs to invoke the ChooseSubtree operation iteratively, i.e., choosing the child node into which to insert the new data object, until a leaf node is reached. If the number of entries in a node reaches the capacity, the Split operation is invoked to divide the entries into two groups. Many R-Tree variants have been proposed, which mainly differ in their strategies for the insertion of new objects (i.e., ChooseSubtree) as well as their algorithms for splitting a node (i.e., Split). Almost all these strategies are based on hand-crafted heuristics. However, there is no single heuristic strategy that dominates all the others in terms of query performance. To illustrate this, we generate a dataset with 1 million uniformly distributed data points and construct four R-Tree variants using four different Split strategies, namely linear [15], quadratic [15], Greene's [14] and R*-Tree [3]. We run 1,000 random range queries and rank the four indices based on the query processing time of each individual query. We observe that no single index has the best performance for all the queries. For example, Greene's Split has the best query performance for 50% of the queries while R*-Tree Split is the best for 49% of the queries. The observation that no single index dominates the others holds on datasets with other distributions and on real-life datasets as well, although the top performers may differ across datasets.

This observation motivates us to develop machine learning models to decide how to choose a subtree for insertion and how to split a node, instead of relying on hand-crafted heuristic rules. Furthermore, we observe that the ChooseSubtree and Split operations can be considered as two sequential decision making problems. Therefore, we model them as two Markov decision processes (MDPs) [30] and propose to use RL to learn models for the two operations.

However, it is a very challenging journey to make the idea work. The first challenge is how to formulate ChooseSubtree and Split as MDPs appropriately: how should we define the states, actions, and reward signals for each MDP? Specifically, 1) Designing the action space. A naive idea could be to define an action as one of the existing heuristic strategies. However, this idea does not work: we observed from our experiments that different strategies would often make the same decision, which leaves little room for improvement. Other possible ideas include defining larger action spaces, but this would increase the difficulty of training the model. 2) Designing the state. As the number of entries varies across different R-Tree nodes, it is challenging to represent the state of a node for both operations. 3) Designing the reward signal. It is nontrivial to design a function that evaluates the reward of past actions during model training so as to encourage the RL agent to take "good" actions.

The second challenge is how to use RL to address the defined MDPs. For instance, in the construction of an R-Tree, node overflow (and thus the Split operation) occurs less frequently than the ChooseSubtree operation. Therefore, only a few state transitions for Split operations are generated, making it difficult for the RL agent to learn useful information. Moreover, a previous decision (ChooseSubtree or Split) may affect the structure of the R-Tree in the future. Therefore, a "good" action may receive a bad reward due to some bad actions that were made previously. This makes it even more challenging to learn a good policy to solve the two MDPs.

Our method. To this end, we propose the RL based R-Tree, namely the RLR-Tree. In the RLR-Tree, we carefully model the MDPs of the operations ChooseSubtree and Split, and propose a novel RL-based method to learn policies to solve them. The learned policies replace the hand-crafted heuristic rules that are commonly used for building R-Trees. The RLR-Tree possesses several salient features: (1) Its models are trained on a small sample of a given dataset and scale successfully to datasets of very large scale. Furthermore, models trained on one data distribution can even be applied to data with a different distribution and achieve query performance improvements of up to 66% compared to the R-Tree. (2) It achieves up to 95% better query performance than the R-Tree and its variants built using one-by-one insertion methods, i.e., R-Trees that are designed for a dynamic environment. (3) Like the R-Tree, the RLR-Tree can handle spatial objects of different types, such as points or rectangles, and all query processing algorithms for R-Trees remain applicable. In contrast, to the best of our knowledge, all existing learned indices [6, 8, 24, 27, 31, 40] can only handle point objects when used for spatial data, and they only focus on limited types of queries, e.g., range queries, as they need a new algorithm for each type of query.

In summary, we make the following contributions:

(1) We propose to train machine learning models to replace heuristic rules in the construction of an R-Tree to improve its query efficiency. To the best of our knowledge, this is the first work that uses machine learning to improve the R-Tree without modifying its structure, so that all currently deployed query processing algorithms remain applicable and the proposed index can be easily deployed by current databases.

(2) We model the operations of how to choose a subtree for insertion and how to split a node as two Markov decision processes, and carefully design their states, actions and reward signals. We also present some of our unsuccessful design attempts.

(3) We design an effective and efficient learning process that learns good policies to solve the MDPs. The learning process enables us to apply our RL models trained with a small dataset to build an R-Tree for up to 100 million spatial objects.

(4) We conduct extensive experiments on both real and synthetic datasets. The experimental results show that our proposed index achieves up to 95% better query performance for range queries and 96% for KNN queries than the R-Tree and its variants built using one-by-one insertion methods.

2 RELATED WORK

2.1 Spatial Indices

Spatial indices are generally classified into three categories, namely data partitioning based, space partitioning based, and mapping based. We further divide the data partitioning based methods into "R-Tree and its variants" and "packing R-Tree".

Data partitioning based indices: R-Tree and its variants. This category of indices partitions the dataset into subsets and indexes the subsets. Typical examples include the R-Tree and its variants, such as the R*-Tree [3], R+-Tree [37], and RR*-Tree [4]. An R-Tree is usually built through a one-by-one insertion approach, and is maintained by dynamic updates. When a spatial object is inserted, the ChooseSubtree operation is based on some heuristic rule, such as minimizing the minimum bounding rectangle (MBR) area enlargement, minimizing the MBR perimeter increment, or minimizing the MBR overlap increment among all child nodes of the current node. A node is split based on some heuristic rule when it contains more entries than its capacity. As discussed in Section 1, none of these heuristic rules has dominant query performance, either for ChooseSubtree or for Split. This partly motivates us to design learning-based solutions for the ChooseSubtree and Split operations in this work.

There exist R-Tree variants that exploit the query workload or special properties of the data. For example, QUASII [29] and CUR-Tree [33] utilize the query workload to build an R-Tree. In [18], an R-Tree index is built for datasets containing a small number of large spatial objects by locating them near the root node. These extensions are orthogonal to our work.

Data partitioning based indices: Packing. An alternative way of building the R-Tree is to pack data points into leaf nodes and then build an R-Tree bottom up. The original R-Tree and almost all of its variants are designed for a dynamic environment, being able to handle insertions and deletions, while packing methods are for static databases [17]. Packing needs to have all the data before building the index, which is not always available in many situations. The existing packing methods explore different ways of sorting spatial objects to achieve a better ordering of the objects and eventually better packing. Some ordering methods [2] are based on the coordinates of objects, such as STR [23], TGS [13] and the lowx packed R-Tree [35], while other ordering methods are based on space filling curves such as z-ordering [28, 32], Gray coding [9] and the Hilbert curve [10].

Other spatial indices. In space partitioning based indices, such as the kd-Tree [5] and Quad-Tree [12], the space is recursively partitioned until the number of spatial objects in a partition reaches a threshold for an index node. In mapping based indices, the spatial dimensions are transformed to a 1-dimensional space based on a space filling curve, and the data objects can then be ordered sequentially and indexed by a B+-Tree. This design is supported in Microsoft SQL Server.

2.2 Learned Indices

Learned one-dimensional indices. The idea behind a learned index is to learn a function that maps a search key to the storage address of a data object. The idea of learned indices was first introduced by [21], which proposes the Recursive Model Index (RMI) to learn the data distribution in a hierarchical manner. It essentially learns a cumulative distribution function (CDF) using a neural network to predict the rank of a search key. The idea inspired a number of follow-up studies [7, 11, 19] on learned indices.

Learned spatial indices. Inspired by the idea of learned one-dimensional indices, several learned spatial indices have been proposed. The Z-order model [40] extends the RMI to spatial data by using a space filling curve to order data points and then learning the CDF to map the key of a data point to its rank. The recursive spatial model index (RSMI) [31] further develops the Z-order idea [40] and the RMI. RSMI first maps the data points to a rank space using the rank space-based transformation technique [32]. Then, RSMI employs a multi-dimensional data partitioning technique to partition the space in a hierarchical manner, and an index is learned for each partition. LISA [24] is a disk-based learned index that achieves low storage consumption and good query performance. It partitions the data space with a grid, numbers the grid cells, and learns a data distribution based on this numbering. Similar to the Z-order model [40] and RSMI [31], Flood [27] also maps a dataset to a uniform rank space before learning a CDF. Differently, it utilizes the workload to optimize the learning of the CDF, and it learns the CDF of each dimension separately. Tsunami [8] extends Flood to better utilize the workload, overcoming Flood's limitations in handling skewed workloads and correlated data. The ML-Index [6] generalizes the idea of the iDistance scaling method [16] to map point objects to a one-dimensional space and then learns the CDF.

Remark. These learned indices all aim to learn a CDF for a particular dataset to replace the traditional indices. The idea of our proposed RLR-Tree is fundamentally different: instead of learning any CDF, we aim to train RL models to learn effective policies for the ChooseSubtree and Split operations of the R-Tree index. Furthermore, these learned indices have the following limitations compared with our solution. First, they can only handle point objects when used for spatial data. In contrast, our proposed method is able to handle any spatial data, such as rectangular objects. Second, they all need customized algorithms to handle each type of query. They focus on certain types of queries, and it is not clear how they can process other types of queries. For example, some of them [6, 8, 27] do not consider KNN queries, an important type of spatial query; some learned indices [24] extend their range query algorithm to handle KNN queries by issuing a series of range queries until 푘 points are found, but the query performance largely depends on the size of the region used. In contrast, the RLR-Tree simply uses the existing R-Tree query processing algorithms to handle different types of queries. Third, some of these learned indices [31, 40] return approximate query results with bounded errors, whereas our solution returns accurate results for all types of queries. Fourth, both Flood and Tsunami need the query workload as input, while we do not assume the query workload to be known. Finally, updates are not discussed for Flood, Tsunami or the ML-Index. Although RSMI [31] and LISA [24] can handle updates, their models have to be retrained periodically to retain good query performance. In contrast, our proposed RLR-Tree readily handles updates without the need to keep retraining the models.

2.3 Applications of Reinforcement Learning

To the best of our knowledge, no existing work uses RL to improve the R-Tree index or its variants. Different from typical applications of RL where there is only one type of agent, our problem has two different intermingled operations, ChooseSubtree and Split. In this sense, our work is related to multi-agent reinforcement learning (MARL), such as MARL for the prisoner's dilemma [1], collaborative MARL [38], and traffic control [22].

RL has been successfully applied to database applications, such as database tuning tasks [25, 39, 44], similarity search [41], join order selection [43], index selection [36], and QD-Tree for data partitioning [42]. In particular, QD-Tree utilizes a query workload to improve the kd-Tree: an RL model is trained to learn a policy that makes partitioning decisions to minimize the number of disk-based block accesses for a given query workload. It depends on a workload and cannot handle dynamic data updates. Compared with these RL applications, our RL setting has two types of intermingled operations, and it is more challenging to design the actions, states and reward functions.

3 PRELIMINARY AND PROBLEM

3.1 Preliminary

An R-Tree [15] is a balanced tree for indexing multi-dimensional objects, such as coordinates, rectangles, and polygons. Each tree node can contain at most 푀 entries, which is referred to as the capacity. Each node (except for the root node) must also contain at least 푚 entries. Each entry in a non-leaf node consists of a reference to a child node and the minimum bounding rectangle (MBR) of all entries within this child node. Each leaf node contains entries, each of which consists of a reference to an object and the MBR of that object. Therefore, a query that does not intersect the bounding rectangle of a node cannot intersect any of the objects contained in it.

To insert an object into an R-Tree, the tree is traversed with two important operations: ChooseSubtree and Split. Starting from the root node, ChooseSubtree is iteratively invoked to decide in which subtree to insert a new data object, until a leaf node is reached. The data object is inserted into the leaf node and its corresponding MBR is updated accordingly. If the number of entries in a leaf node exceeds the capacity, the Split operation is invoked to divide the objects into two groups: one remains in the original leaf node and the other becomes a new leaf node. The Split operation may need to be propagated upwards: an entry referring to the new leaf node is added to the parent node, which may overflow and need to be split recursively.

The query performance of an R-Tree is highly dependent on how the R-Tree is built. Intuitively, to process queries efficiently, the MBRs of the nodes should not cover too much empty space and should not overlap with each other too much. To this end, many R-Tree variants have been proposed with different ChooseSubtree and Split strategies.

3.2 Goal of this paper

As presented in Section 1, most of the existing R-Tree variants adopt hand-crafted ChooseSubtree and Split strategies, and no strategy has dominant query performance in all cases. Motivated by this, we aim to learn to build an R-Tree in a data-driven way: we learn models based on reinforcement learning to replace the hand-crafted heuristic rules for ChooseSubtree and Split, i.e., we use the learned models to decide how to choose a subtree for insertion and how to split an overflowing node. The new index is called the RLR-Tree.

4 RLR-TREE

The process of inserting a new object into an R-Tree is essentially a combination of two typical sequential decision making processes. In particular, starting from the root, it needs to decide which child node to insert the new object into at each level of a top-down traversal (ChooseSubtree). It also needs to decide how to split an overflowing node and divide its entries in a bottom-up traversal (Split).

Reinforcement learning (RL) has been proven to be effective in solving sequential decision making problems. Therefore, we propose to model the insertion of a new object as a combination of two Markov decision processes (MDPs) and adopt RL to learn policies for the ChooseSubtree and Split operations, respectively. However, as discussed in the Introduction, it is challenging to design RL-based approaches for ChooseSubtree and Split. In this section, we present both our final designs for the two MDPs and the other representative designs that we explored to formulate the problem. Specifically, we present how to choose a subtree with RL in Section 4.1 and how to split an overflowing node in Section 4.2. Finally, we present how to combine the two learned models to construct the RLR-Tree in Section 4.3.

4.1 ChooseSubtree

To insert a new object into the R-Tree, we need to conduct a top-down traversal starting from the root. In each node, we need to decide which child node to insert the new object into. To choose a subtree with RL, we formulate this problem as an MDP. We then present how to train a model to learn a policy for the MDP.

4.1.1 MDP Formulation. An MDP consists of four components, namely states, actions, transitions, and rewards. We proceed to explain how states and actions are represented in our model, and then present the reward signal design, which is particularly challenging.

MDP: State Space. A state captures the environment that is taken into account for decision making. For ChooseSubtree, it is natural for a state to be derived from the tree node whose child nodes are candidates for hosting the new object. The challenging question is: what kind of information should we extract from the tree node to represent the state?

Intuitively, as we need to decide which child node to insert the new object into, it is necessary to capture how each child node would change if we added the new object to it. Possible features that capture the change of a child node 푁 include: (1) Δ퐴푟푒푎(푁, 표), the area increase of the MBR of 푁 if we add the new object 표 into 푁; (2) Δ푃푒푟푖(푁, 표), the perimeter increase of the MBR of 푁 if we add 표 into 푁; and (3) Δ푂푣푙푝(푁, 표), the increase of the overlap between 푁 and the other child nodes after 표 is inserted into 푁. Furthermore, it is helpful to know the occupancy rate of the child node, denoted by 푂푅(푁), which is the ratio of the number of entries to the capacity. A child node with a high occupancy rate is more likely to overflow in the future.

Given these features, a straightforward idea is to compute them for every child node and concatenate them to represent the state. However, the number of child nodes varies across different nodes, making it difficult to represent a state with a vector of fixed length. One way to address this is padding, i.e., appending zeros to the features of the child nodes to obtain a 4·푀 dimensional vector, as there are four features and at most 푀 child nodes. However, as most of the nodes in an R-Tree are not full, the padded representations are likely to contain many zeros. The padded zeros add noise and mislead the model, resulting in poor performance, which is confirmed by our preliminary experiments.

To address the challenge, we propose to use only a small subset of the child nodes to define the state. This is because most of the child nodes are not good candidates for hosting the new object, as inserting the new object may greatly enlarge their MBRs. Here we can deploy heuristics to prune unpromising child nodes from the state space, and our RL agent will not consider them when representing a state. Our state representation is designed as follows. We first sort the child nodes by their area increase and retrieve the top-푘 child nodes, where sorting by area increase is based on empirical findings. Then, for each retrieved child node, we compute the four features Δ퐴푟푒푎(푁, 표), Δ푃푒푟푖(푁, 표), Δ푂푣푙푝(푁, 표), and 푂푅(푁). We concatenate the features of the 푘 child nodes to obtain a 4·푘 dimensional vector that represents the state. 푘 is a parameter to be determined empirically. Note that to make the representations of different states comparable, the increases of area, perimeter and overlap are normalized by the maximum corresponding value among the 푘 child nodes.

Remark. It is natural to consider including more features in the state, for instance global information, such as the tree depth and the size of the tree, and local information, such as the depth of the tree node or the coordinates of the boundary of its MBR. However, our experiments show that these features do not improve the performance of our model while making model training slower. The four features that we use are sufficient to train our model to make good subtree choices, as shown in our experiments. Designing and evaluating other state features is a useful direction for future work.
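For illustration, the following is a minimal Python sketch of how the 4·푘-dimensional ChooseSubtree state could be assembled. The rectangle helpers, function names, and the zero-padding used when a node has fewer than 푘 children are illustrative assumptions and do not correspond to the actual (C++) implementation.

from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def perimeter(r: Rect) -> float:
    return 2.0 * ((r[2] - r[0]) + (r[3] - r[1]))

def enlarge(r: Rect, o: Rect) -> Rect:
    # MBR of r after adding object o
    return (min(r[0], o[0]), min(r[1], o[1]), max(r[2], o[2]), max(r[3], o[3]))

def overlap(a: Rect, b: Rect) -> float:
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def choose_subtree_state(children: List[Rect], entries: List[int],
                         o: Rect, k: int, capacity: int):
    """Return (4*k state vector, indices of the top-k candidate children)."""
    d_area = [area(enlarge(c, o)) - area(c) for c in children]
    # keep the k children whose MBR area increases the least
    top = sorted(range(len(children)), key=lambda i: d_area[i])[:k]
    rows = []
    for i in top:
        new_mbr = enlarge(children[i], o)
        d_peri = perimeter(new_mbr) - perimeter(children[i])
        d_ovlp = sum(overlap(new_mbr, children[j]) - overlap(children[i], children[j])
                     for j in range(len(children)) if j != i)
        rows.append([d_area[i], d_peri, d_ovlp, entries[i] / capacity])
    # normalise the three increase features by their maximum over the k candidates
    for f in range(3):
        m = max(r[f] for r in rows) or 1.0
        for r in rows:
            r[f] /= m
    state = [v for r in rows for v in r]
    state += [0.0] * (4 * k - len(state))  # pad when the node has fewer than k children
    return state, top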

MDP: Action Space. An action is a possible decision that can be made by the RL agent. Many R-Tree variants have been proposed with different ChooseSubtree strategies, such as minimizing the increase of area, perimeter, or overlap. A straightforward idea is therefore to treat these cost functions as the actions, i.e., the agent decides which cost function to use to choose the subtree. We explored this idea extensively, trying different combinations of cost functions as well as different state space designs. However, according to our experimental results, this idea is not effective. Table 1 reports the average relative node accesses for processing 1,000 random range queries on three datasets of different distributions, namely Skew, Gaussian and Uniform (relative node access is defined in the Experiments section). Intuitively, if the relative node access is smaller than 1, the index requires fewer node accesses than the R-Tree, i.e., the index outperforms the R-Tree. We observe from Table 1 that compared with the classic R-Tree, an MDP model with the cost functions as actions achieves an improvement of less than 2% in query efficiency, which is much worse than our final design (to be presented later). A possible reason, observed in our experiments, is that in 90% of the nodes in an R-Tree the different cost functions end up with the same subtree choice, which leaves little room for improvement.

Table 1: Performance of cost function based action space.

                       Relative node access
                       Skew    Gaussian    Uniform
Use cost functions     0.98    0.98        1.00
Our final design       0.29    0.08        0.56

After much effort, we gave up the idea of using existing cost functions as the action space. Instead, we propose to train the RL agent to directly decide which child node to insert the new object into. Based on this idea, one design is to have all child nodes comprise the action space. However, this incurs two challenges: 1) the number of child nodes differs across nodes, and 2) the action space is large. Considering all child nodes as actions leads to a large action space with many "bad actions", which makes the exploration during model training ineffective and inefficient. To address these challenges, we use the same idea as in the state space design, i.e., we remove the child nodes that cannot be good actions, identified using an existing heuristic rule, namely those whose MBRs would be greatly enlarged if the new object were inserted. Furthermore, recall that in designing the state space we retrieve the top-푘 child nodes in terms of area increase to represent a state. To make the action space consistent with the state representation, we define the action space A = {1, . . . , 푘}, where action 푎 = 푖 means that the RL agent chooses the 푖-th retrieved child node to host the new object.

MDP: Transition. In the ChooseSubtree operation, given a state (a node in the R-Tree) and an action (inserting the new object into a child node), the RL agent transits to the chosen child node. If the child node is a leaf node, the agent reaches a terminal state.

MDP: Reward Signal. A reward associated with a transition is feedback indicating the quality of the action taken at a given state; a larger reward indicates better quality. Since our objective is to learn to build an R-Tree that processes queries efficiently, the reward signal is expected to reflect the improvement of query performance.

In the process of ChooseSubtree, it is challenging to directly evaluate whether an action taken at a state is good or not, because the new object has not yet been fully inserted into the R-Tree. A straightforward idea is that, after the new object has been inserted, we use the R-Tree to process a set of random range queries, and the inverse of the cost (e.g., the number of accessed nodes) for processing the queries is set as the reward shared by all state-action pairs encountered in the insertion of the new object. At first glance, the agent seems to be encouraged to take actions that lead to an R-Tree that processes range queries by accessing as few nodes as possible. However, this is not the case for the following reasons: (1) A previous action may affect the structure of the R-Tree in the future, which may further affect query performance. Therefore, a "good" action may receive a poor reward due to some bad actions made previously. (2) More importantly, as we aim to learn to construct an R-Tree that outperforms its competitors, we are interested in knowing what kind of actions makes the resulting R-Tree better than a competitor, and what kind of actions makes it worse. The aforementioned reward signal cannot distinguish the two types of actions, making it ineffective for the agent to learn a good policy to outperform the competitors. (3) As more objects are inserted into the R-Tree, the average number of accessed nodes naturally increases. Therefore, the reward signal becomes weaker and weaker, which makes it difficult for the model to learn useful information in the late stages of training.

Inspired by these observations, we design a novel reward signal for ChooseSubtree. The high-level idea is that we maintain a reference tree with fixed ChooseSubtree and Split strategies. The reference tree serves as a competitor and can be any existing R-Tree variant. The reward signal is computed based on the gap between the costs of processing random queries with the reference tree and with the RLR-Tree. Specifically, the reward signal is designed as follows:

(1) We maintain an R-Tree, namely the RLR-Tree, that uses RL to decide which child node to insert the new object into, and adopts a pre-specified Split strategy.
(2) We maintain a reference tree which adopts a pre-specified ChooseSubtree strategy and the same Split strategy as the RLR-Tree.
(3) We synchronize the reference tree with the RLR-Tree, so that they have the same tree structure.
(4) Given 푝 new objects {표1, . . . , 표푝}, we insert them into both the reference tree and the RLR-Tree.
(5) After the 푝 objects are inserted, we generate 푝 range queries of predefined sizes whose centers are at the 푝 objects, respectively.
(6) We process the 푝 range queries with both the reference tree and the RLR-Tree. We compute the normalized node access rate, which is defined as (# accessed nodes) / (tree height), i.e., the number of nodes accessed for answering a range query divided by the tree height. Let 푅 and 푅′ be the normalized node access rates of the RLR-Tree and the reference tree, respectively. We compute 푟 = 푅′ − 푅, the gap between the two rates, as the reward signal. The higher 푟 is, the fewer nodes the RLR-Tree needs to access to process the range queries compared with the reference tree.
(7) All the transitions encountered in the insertion of the 푝 objects share the same reward 푟.
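The reward computation in steps (5)–(7) can be sketched as follows in Python, assuming hypothetical per-tree counters count_nodes(q) that return the number of nodes touched when answering a range query q, together with the tree height; averaging over the 푝 queries is our reading of the description above.

def normalized_node_access_rate(count_nodes, height, queries):
    # (# accessed nodes) / (tree height), averaged over the p range queries
    return sum(count_nodes(q) for q in queries) / (len(queries) * height)

def choose_subtree_reward(rlr_count_nodes, rlr_height,
                          ref_count_nodes, ref_height, queries):
    # r = R' - R: positive when the RLR-Tree accesses fewer nodes than the reference tree
    R = normalized_node_access_rate(rlr_count_nodes, rlr_height, queries)
    R_ref = normalized_node_access_rate(ref_count_nodes, ref_height, queries)
    return R_ref - R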

With this reward design, we are able to distinguish good actions from bad actions: a positive reward means that the RLR-Tree handled the recent 푝 insertions well, as it requires fewer node accesses to process the queries than the reference tree does. Moreover, as the reference tree is periodically synchronized with the RLR-Tree, we avoid the effect of previous actions. Therefore, maximizing the accumulated reward is equivalent to encouraging the agent to take the actions that make the RLR-Tree outperform the competitors.

4.1.2 Training the Agent for ChooseSubtree. We have formulated the MDP for ChooseSubtree. Next, we present how to use RL to learn a policy for this MDP.

Deep-푄-Network (DQN) Learning. Deep Q-learning is a commonly used model-free RL method. It uses a Q-function 푄∗(푠, 푎) to represent the expected accumulated reward that the agent can obtain if it takes action 푎 in state 푠 and then follows the optimal policy until it reaches a terminal state. The optimal policy takes the action with the maximum 푄-value in any state. The Deep-푄-Network (DQN) [26] approximates the Q-function 푄∗(푠, 푎) with a deep neural network 푄(푠, 푎; Θ) with parameters Θ. In our model, we adopt deep Q-learning with experience replay [26] for learning the 푄-functions. Given a batch of transitions (푠, 푎, 푟, 푠′), the parameters of 푄(푠, 푎; Θ) are updated with a gradient descent step by minimizing the mean square error (MSE) loss function shown in Equation 1:

    퐿(Θ) = Σ_{(푠,푎,푟,푠′)} [ ( 푟 + 훾 · max_{푎′} 푄ˆ(푠′, 푎′; Θ−) − 푄(푠, 푎; Θ) )² ],    (1)

where 훾 is the discount rate and 푄ˆ(·; Θ−) is the frozen target network.

Training the Agent. Now we are ready to present how to train the agent for ChooseSubtree. Figure 1 shows the training process for ChooseSubtree, and the detailed procedure is presented in Algorithm 1.

Figure 1: RL ChooseSubtree Model Training.

Algorithm 1: DQN Learning for ChooseSubtree
1   Input: A training dataset;
2   Output: Learned action-value function 푄(푠, 푎; Θ);
3   Initialize 푄(푠, 푎; Θ), 푄ˆ(푠, 푎; Θ−);
4   for 푒푝표푐ℎ = 1, 2, . . . do
5       Replay memory M ← ∅;
6       for every 푝 objects {표1, . . . , 표푝} in the dataset do
7           푇푟 ← 푇푟푙, 푆퐴 ← ∅, 푅푄 ← ∅;
8           for 표푖 ∈ {표1, . . . , 표푝} do
9               Insert 표푖 into 푇푟;
10              푁 ← the root of 푇푟푙;
11              while 푁 is non-leaf do
12                  푠 ← state representation of 푁 and 표푖;
13                  푎 ← an action selected by 휖-greedy based on 푄-values;
14                  푁 ← 푎, 푆퐴 ← 푆퐴 ∪ {(푠, 푎)};
15              Insert 표푖 into 푁, split until no node overflows;
16              푅푄 ← 푅푄 ∪ {a range query centered at 표푖};
17          푟 ← compute reward with queries in 푅푄;
18          Add (푠, 푎, 푟, 푠′) for every (푠, 푎) ∈ 푆퐴 into memory;
19          Draw samples from memory and update Θ in 푄(·; Θ);
20          Periodically synchronize 푄ˆ(·; Θ−) with 푄(·; Θ);

We first initialize the main network 푄(푠, 푎; Θ) and the target network 푄ˆ(푠, 푎; Θ−) with the same random weights (line 3). Each epoch first resets the replay memory (line 5) and then performs a sequence of insertions of the objects in the training dataset (lines 6–20). Specifically, for every 푝 objects {표1, . . . , 표푝}, we synchronize the structure of 푇푟 with 푇푟푙 (line 7). For each 표푖 of the 푝 objects, we first insert it into the reference tree (line 9). Then a top-down traversal of the RLR-Tree is conducted (lines 10–15). At each level, we compute the state representation (line 12) and use 휖-greedy to choose an action based on the 푄-values (line 13), until we reach a terminal state (a leaf node). The transitions are stored in 푆퐴 (line 14). At the leaf node, we insert the new object and use the same Split strategy as the reference tree in a bottom-up pass to ensure that no node overflows (line 15). Meanwhile, we generate a range query of a predefined size centered at 표푖 and add the query to 푅푄 (line 16). When the 푝 objects have been inserted, we compute the reward with the queries in 푅푄 (line 17). All transitions encountered in the insertions of the 푝 objects share the same reward 푟 and are pushed into the replay memory (line 18). Then we draw a batch of transitions randomly from the replay memory and use the batch to update the parameters of the main network 푄(·; Θ) as DQN does (line 19). The parameters of the target network 푄ˆ are periodically synchronized with 푄 (line 20).

Remark. It is possible that the new object to be inserted is fully contained in one of the child nodes. If we add the new object into such a child node, the MBRs of all child nodes are not affected. Therefore, it is unnecessary to make the agent consider such cases. In our implementation, when such a case happens, we do not pass the state representation to the model but directly choose the child node that contains the new object.
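Since the models are trained with PyTorch (Section 5.1), the Q-network and the gradient step of Equation 1 used in line 19 of Algorithm 1 can be sketched roughly as follows. The network width (one hidden layer of 64 SELU units), batch size and discount factor follow Section 5.1; the replay-memory layout, the simplified terminal-state handling and all variable names are our own illustrative assumptions.

import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    # one hidden layer of 64 SELU units, one Q-value per action a = 1..k
    def __init__(self, state_dim: int, k: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.SELU(), nn.Linear(64, k))

    def forward(self, s):
        return self.net(s)

def epsilon_greedy(q: QNet, state, k: int, eps: float) -> int:
    # action selection of line 13 in Algorithm 1
    if random.random() < eps:
        return random.randrange(k)
    with torch.no_grad():
        return int(q(torch.tensor(state, dtype=torch.float32)).argmax())

def dqn_update(q, q_target, memory, optimizer, batch_size=64, gamma=0.95):
    """One gradient step on the MSE loss of Equation 1 over a replay mini-batch
    of (s, a, r, s') tuples (terminal-state handling omitted for brevity)."""
    batch = random.sample(memory, batch_size)
    s  = torch.tensor([t[0] for t in batch], dtype=torch.float32)
    a  = torch.tensor([t[1] for t in batch], dtype=torch.int64)
    r  = torch.tensor([t[2] for t in batch], dtype=torch.float32)
    s2 = torch.tensor([t[3] for t in batch], dtype=torch.float32)
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * q_target(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Under the settings of Section 5.1, the optimizer could be, for example, torch.optim.Adam(q.parameters(), lr=0.003), and the target network can be refreshed every 30 updates with q_target.load_state_dict(q.state_dict()).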
4.1.3 Time Complexity. In our analysis, the additional computation cost associated with the use of neural networks in an RLR-Tree is treated as a constant. Assume the RLR-Tree has a size of 푆 and a height of ℎ. Inserting an object into the RLR-Tree encounters ℎ − 1 states. At each state, it takes 푂(푘·푀) time to retrieve the top-푘 child nodes and 푂(푀) time to compute the features for each child node. Therefore, the overall time complexity is 푂(ℎ·푘·푀). As a comparison, ChooseSubtree takes 푂(ℎ·푀) time in the R-Tree.

4.2 Split

The top-down traversal ends at a leaf node. If the leaf node is not full, the new object is inserted. Otherwise, the leaf node has to be split into two nodes, and the Split operation may be propagated upwards. In this subsection, we present how to model Split as an MDP and how to train a model to learn a policy to solve it.

4.2.1 MDP Formulation. Similar to ChooseSubtree, we explored different ideas for designing the state, action, transition and reward signal of the MDP for Split. Due to space limitations, we only present the final design here.

MDP: State Space. For Split, it is natural for a state to be derived from the node that overflows. A straightforward idea is to make the state representation capture the goodness of all possible splits, so that the agent can decide how to split the node. However, since an overflowing node contains 푀 + 1 entries, there are 2^(푀+1) − 2 possible splits in total, and it is impractical to reflect all of them in the state representation.

In order to avoid considering so many possible splits, we adopt an idea similar to the R*-Tree [4]: we first sort the entries with respect to their projection onto each dimension. For each sorted sequence, we consider the split at the 푖-th element (푚 ≤ 푖 ≤ 푀 + 1 − 푚), where the first 푖 entries are assigned to the first group and the remaining 푀 + 1 − 푖 entries are assigned to the second group. This means that for each sorted sequence, we only consider 푀 + 2 − 2·푚 splits. Next, we discard the splits that create two overlapping nodes. We sort the remaining splits in ascending order of total area and pick the top-푘 splits to construct the state representation. Specifically, for each split, we consider four features: the areas and the perimeters of the two nodes created by the split. We concatenate the features of all 푘 splits to generate a 4푘-dimensional vector that represents the state. Note that the areas and perimeters are normalized by the maximum area and perimeter among all splits, so that each dimension of the state representation falls in (0, 1].

MDP: Action Space. Similar to ChooseSubtree, in order to make the action space consistent with the candidate splits used to represent the state, we define the action space as A = {1, . . . , 푘}, where 푘 is the number of splits used to represent the state. An action 푎 = 푖 means that the 푖-th split is adopted.

MDP: Transitions. In Split, given a state (a node in the R-Tree) and an action (a possible split), the agent transits to the state that represents the parent of the node. If the node does not overflow, it is a terminal state.

MDP: Reward Signal. The reward signal for Split is similar to that of ChooseSubtree. We maintain a reference tree that is periodically synchronized with the RLR-Tree, and we use the gap between the costs of processing random queries with the two trees as the reward signal. The difference is that here the RLR-Tree adopts the same ChooseSubtree strategy as the reference tree and uses the RL agent to decide how to split an overflowing node.

4.2.2 Training the Agent for Split. Compared with ChooseSubtree, training the agent for Split is a more challenging task. To insert an object into the R-Tree, ChooseSubtree is iteratively invoked at each level of the top-down traversal, whereas Split is invoked only when a node overflows. Therefore, only a few transitions for Split are available for training, making it difficult for the agent to learn a good policy in a reasonable time.

To tackle this challenge, we design a new method for the agent to interact with the environment, so that the Split operation is frequently invoked. Intuitively, if all the nodes in the R-Tree are full, the insertion of a new object definitely causes a tree node to split. Inspired by this, we propose to first build an R-Tree in which most of the nodes are full, so that node splits are frequently encountered. We generate such almost-full R-Trees of different sizes, and use the transitions caused by inserting the remaining objects of the dataset for training.

Figure 2 shows the framework for training for Split, and the procedure is presented in Algorithm 2. In each epoch, we run 푝푎푟푡푠 − 1 iterations to train the agent (lines 5–24). In particular, in the 푗-th iteration, we use the first 푗/푝푎푟푡푠 of the objects in the dataset to construct an R-Tree 푇푏푎푠푒 (line 7). For each of the remaining objects, if inserting it into 푇푏푎푠푒 causes splits, we add it to 푂푡푟푎푖푛, which will later be used to trigger splits for training (line 9); otherwise, the object is inserted into 푇푏푎푠푒 (line 10). In this way, most of the nodes in 푇푏푎푠푒 are likely to be full, and the objects in 푂푡푟푎푖푛 are likely to cause splits. After 푇푏푎푠푒 is constructed, we start training with the objects in 푂푡푟푎푖푛 (lines 11–24). For every 푝 objects, we first synchronize 푇푟 and 푇푟푙 with 푇푏푎푠푒 (line 12). This makes 푇푟 and 푇푟푙 have the same, almost-full structure. Then for each of the 푝 objects, we insert it into the reference tree 푇푟 directly with the pre-specified ChooseSubtree and Split strategies (line 14). For the RLR-Tree, we use the same ChooseSubtree strategy as the reference tree to reach a leaf node 푁 (line 15). If 푁 overflows, we generate a range query of a predefined size centered at 표푖 and add the query to 푅푄 (line 16). Then we iteratively split 푁 and move to its parent until 푁 does not overflow (lines 17–20). For each node 푁, we compute the state representation and use 휖-greedy to select an action based on the 푄-values (lines 18–19). The transitions are stored in 푆퐴 (line 20). When the 푝 objects have been processed, we compute the reward with the queries in 푅푄 (line 21). The transitions encountered in processing the 푝 objects share the same reward (line 22). Then we draw a batch of random transitions from the replay memory and use it to update the parameters of the main network (line 23). The parameters of the target network 푄ˆ(·; Θ−) are periodically synchronized with the main network 푄(·; Θ) (line 24).

Remark. When we split a node into two, the ideal case is that the two nodes do not overlap with each other, so that the number of node accesses for processing a query can be reduced. It is more challenging when there are many candidate splits that generate two nodes with zero overlap, as we need to carefully consider how to break the tie. As a result, we treat this special case in the exploration of the agent as follows: if there exists at most one candidate split that generates two non-overlapping nodes, we do not compute the state representation or choose an action based on the Q-values, but simply select the split with the minimum overlap. We only use RL to decide how to split the node when more than one split generates non-overlapping nodes. In this way, the agent is trained more effectively.
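The candidate-split generation of Section 4.2.1, including the tie-breaking rule of the remark above, could look roughly like the Python sketch below. It reuses the illustrative MBR helpers from the earlier ChooseSubtree sketch, sorts only by the lower coordinate of each dimension for brevity, and is not the actual implementation.

def mbr_of(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def candidate_splits(entries, m, k):
    """entries: MBRs of the M+1 entries of an overflowing node.
    Returns (forced_split, top_k); exactly one of the two is not None."""
    M = len(entries) - 1
    cands = []
    for dim in (0, 1):
        order = sorted(range(len(entries)), key=lambda i: entries[i][dim])
        for i in range(m, M + 2 - m):            # split positions m .. M+1-m
            g1 = [entries[j] for j in order[:i]]
            g2 = [entries[j] for j in order[i:]]
            r1, r2 = mbr_of(g1), mbr_of(g2)
            cands.append({"overlap": overlap(r1, r2),
                          "total_area": area(r1) + area(r2),
                          "groups": (order[:i], order[i:]),
                          "features": [area(r1), perimeter(r1), area(r2), perimeter(r2)]})
    zero_ovlp = [c for c in cands if c["overlap"] == 0.0]
    if len(zero_ovlp) <= 1:
        # at most one non-overlapping option: pick the minimum-overlap split directly
        return min(cands, key=lambda c: (c["overlap"], c["total_area"])), None
    # otherwise the RL agent chooses among the top-k zero-overlap splits by total area
    return None, sorted(zero_ovlp, key=lambda c: c["total_area"])[:k]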

Algorithm 2: DQN Learning for Split
1   Input: A training dataset;
2   Output: Learned action-value function 푄(푠, 푎; Θ);
3   Initialize 푄(푠, 푎; Θ), 푄ˆ(푠, 푎; Θ−);
4   for 푒푝표푐ℎ = 1, 2, 3, . . . do
5       for 푗 = 1, 2, 3, . . . , (푝푎푟푡푠 − 1) do
6           푂푡푟푎푖푛 ← ∅;
7           Construct 푇푏푎푠푒 with the first 푗/푝푎푟푡푠 of the objects;
8           for 표 in the remaining objects do
9               if inserting 표 into 푇푏푎푠푒 causes a split then add 표 to 푂푡푟푎푖푛;
10              else insert 표 into 푇푏푎푠푒;
11          for every 푝 objects {표1, . . . , 표푝} ⊆ 푂푡푟푎푖푛 do
12              푇푟푙 ← 푇푏푎푠푒, 푇푟 ← 푇푏푎푠푒, 푆퐴 ← ∅, 푅푄 ← ∅;
13              for 표푖 ∈ {표1, . . . , 표푝} do
14                  Insert 표푖 into 푇푟;
15                  Top-down traversal on 푇푟푙 to a leaf node 푁;
16                  if 푁 overflows then 푅푄 ← 푅푄 ∪ {random query};
17                  while 푁 overflows do
18                      푠 ← state representation of 푁;
19                      푎 ← an action selected by 휖-greedy based on 푄-values;
20                      푁 ← 푁's parent, 푆퐴 ← 푆퐴 ∪ {(푠, 푎)};
21              푟 ← compute reward with queries in 푅푄;
22              Add (푠, 푎, 푟, 푠′) for all (푠, 푎) ∈ 푆퐴 to memory;
23              Draw samples from memory and update 푄(·; Θ);
24              Periodically synchronize 푄ˆ(·; Θ−) with 푄(·; Θ);

Figure 2: RL Split Model Training.

The training process for Split has two advantages. Firstly, using different fractions of objects to build an "almost-full" base tree helps to learn a more general model. Secondly, building the "almost-full" base tree and periodically resetting 푇푟 and 푇푟푙 to the base tree ensures that the Split operation is consistently invoked at a high frequency. This makes the training process more efficient.

4.2.3 Time Complexity. Similar to ChooseSubtree, the additional computation cost associated with the use of neural networks is treated as a constant. If a node overflows, at most 푂(ℎ) Split operations are invoked. In each Split operation, we first sort the entries along each dimension, which takes 푂(푀 log 푀) time. Then it takes 푂(푘·(푀 − 2푚)·푀) time to retrieve the top-푘 splits with minimum total area. Finally, it takes 푂(푘·푀) time to compute the four features for the 푘 splits. Therefore, the overall time complexity of an RLR-Tree Split is 푂(ℎ·푘·(푀 − 2푚)·푀). As a comparison, the Split operation takes 푂(ℎ·푀²) time in the classic R-Tree.

4.3 The Combined Model

We have presented how to train the DQN models for ChooseSubtree and Split separately. As we aim to learn to build an R-Tree with good query efficiency, we expect the two agents to be closely tied together and to help each other achieve better performance. One idea for combining the two models is to make the agents share information, such as the observed states and selected actions, with each other, which may enhance the learned policy. However, this is not the case in our problem. Recall that we specially design a learning process for Split, as node overflow occurs infrequently in the construction of an R-Tree. Therefore, the learning processes of ChooseSubtree and Split deal with different data via different procedures, making it difficult to exchange information between the two agents.

Motivated by this, we propose to train the two agents alternately. Specifically, in odd epochs, the RL agent for ChooseSubtree undergoes training while the Split strategy is fixed to that of the RLR-Tree; in even epochs, the agent for Split undergoes training while the ChooseSubtree strategy is fixed to that of the RLR-Tree.

With the learned models for ChooseSubtree and Split, the final RLR-Tree can be built using the one-by-one insertion method. Figure 3 shows the procedure of inserting an object into the RLR-Tree. Specifically, in the top-down traversal, we compute the state representation of a node and use the model trained for ChooseSubtree to select the subtree corresponding to the action with the maximum 푄-value. When a node overflows, we compute its state representation and use the model trained for Split to choose the split corresponding to the action with the maximum 푄-value.

Figure 3: Inserting an object into the RLR-Tree.

5 EXPERIMENTS

We evaluate the performance of RL ChooseSubtree, RL Split, and the RLR-Tree.

5.1 Experimental Setup than the R-Tree, and smaller RNA values indicates better query Datasets. We use 3 synthetic datasets used in previous work on performance compared with the R-Tree. Note that all the indices spatial indices [2, 31, 32], and 2 large real-life datasets. compared use the same sequence of object insertion. Parameter settings. Table 2 shows a list of parameters and their (1) Skew (SKE): It is a synthetic dataset consisting of small corresponding values tested in our experiments. The default settings squares of a fixed size. The 푥 and 푦 coordinates of the squares are bold. For all R-Tree variants evaluated in this paper, we maintain centers are randomly generated from a uniform distribution a maximum of 50 and a minimum of 20 child nodes per tree node. in the range [0, 1] and then “squeezed” in the 푦-dimension, ( ) ( 9) that is, each square center 푥,푦 is replaced by 푥,푦 ; Table 2: Parameters and Values (2) Gaussian (GAU): It is a synthetic dataset consisting of small squares of a fixed size. The coordinates of the center ofa Parameters Values Data distribution SKE, GAU, UNI square are (푥,푦) where 푥 and 푦 are randomly generated from Dataset size (million) 1, 5, 10, 20, 100 = Training set size (thousand) 25, 50, 100, 200 a Gaussian distribution with mean 휇 0.5 and standard Training query size (%) 0.005, 0.01, 2 deviation 휎 = 0.2; Testing query size (%) 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2 (3) Uniform (UNI): It is a synthetic dataset consisting of small Action space size 푘 2, 3, 5, 10 squares of a fixed size. The 푥 and 푦 coordinates of the cen- ters of the squares are randomly generated from a uniform The DQN models for both ChooseSubtree and Split contain 1 distribution in the range [0, 1]; hidden layer of 64 neurons with SELU [20] as the activation function. (4) OSM China (CHI): It is a real dataset that contains more than In the training process, the learning rate is set to be 0.003 for RL 98 million points in China extracted from OpenStreetMap; ChooseSubtree and 0.01 for RL Split. The initial value of 휖 is set (5) OSM India (IND): It is a real dataset that contains more than to be 1 and the decay rate is set to be 0.99. The value of 휖 is never 100 million points in India extracted from OpenStreetMap. allowed to be less than 0.1 in order to maintain a certain degree of exploration throughout model training. The replay memory can By following previous work, all spatial data objects in synthetic contain at most 5,000 (푠, 푎, 푟, 푠′) tuples. Network update is done by datasets fall within the unit square. first sampling a batch of 64 tuples from the replay memory. Then Queries. To measure the reward for our proposed methods, we Θ is updated by using gradient descent of the MSE loss function to propose to run range queries over both a reference R-Tree and close the gap between the Q-value predicted by Θ and the optimal the RLR-Tree built by the proposed method. To support this, we Q-value derived from Θ−. The discount factor is set to be 0.95 for generate range queries of different sizes. For model testing, we RL ChooseSubtree and 0.8 for RL Split. Synchronization of Θ− with randomly generate 1,000 range queries for each query size ranging Θ is done once every 30 network updates. from 0.005% to 2% of the whole region. Note that queries used for During the model training for RL ChooseSubtree (resp. RL Split), training and testing are completely different. the deterministic splitting (resp. insertion) rules are set to be the Baselines. 
As the objective of this work is to use machine learn- same as that used by the reference tree which is minimum overlap ing techniques to improve on the performance of the R-Tree that partition (resp. minimum node area enlargement). We train the RL supports updates in dynamic environments, we compare with the ChooseSubtree and Split models for 20 and 15 epochs, respectively, R-Tree and its variants that are designed for this purpose. Baselines and set 푝푎푟푡푠 in Algorithm 2 to be 15, i.e., the training dataset is used in the experiments include R-Tree [15] which is also the ref- divided into 15 equal parts. The action space size 푘 for both RL erence tree used for model training, R*-Tree [3] and RR*-Tree [4], ChooseSubtree and RL Split is set to be 2 by default. Note that the which is reported to have the best query performance among R-Tree trivial case of 푘 = 1 simply gives us the reference tree. variants built using one-by-one insertion for supporting queries We train our models on NVIDIA Tesla V100 SXM2 16 GB GPU us- in dynamic environments. Note that we do not compare with the ing PyTorch 1.3.1. All indices including both RLR-Tree and baselines packing R-Trees that are designed for the static databases [17] or the are coded using C++. learned indices as they are not designed for dynamic environment as well discussed in Section 2. 5.2 Experimental Results Measurements. We run all the indices in main memory for ease We first report the evaluation result ofRL ChooseSubtree and RL of comparisons. Note that it is straightforward to place the data Split, then the performance result of the proposed RLT-Tree, and blocks in external memory. For measurements of performance, we finally other experimental results. consider both running time and the number of node access. We find that both measures yield qualitatively consistent result because all 5.2.1 RL ChooseSubtree. We report the average RNA of the RL the indices compared in this study are R-Tree based. We only report ChooseSubtree on three synthetic datasets by varying the query re- the number of node accesses in this paper. Note that the number gion size and dataset size, respectively. Figure 4a shows that the RL of node accesses can also serve as a performance indicator for an ChooseSubtree model outperforms R-Tree consistently over differ- external memory based implementation of the R-Tree indices. More ent query sizes. The best RNA is 0.07 and observed on GAU dataset specifically, we report the average relative nodes accesses (RNA). for query size 0.005%. We also observe that the performance of RL For each query, the RNA of an index is computed by the ratio of the ChooseSubtree gets better as query size decreases. This is because number of nodes accessed for it to answer the query to the number a query with a larger size intersects with a larger number of tree of nodes accessed for an R-Tree to answer the same query. RNA nodes, and more tree nodes will then be traversed when answering value below 1 indicates that the index query performance is better such a query. In this case, all R-Tree based indices need to visit Tu Gu1, Kaiyu Feng1, Gao Cong1, Cheng Long1, Zheng Wang1, Sheng Wang2 a relatively larger portion of the data and the difference between lowest RNA is 0.2 which is observed for data size 20 million and different index techniques will diminish. Consider an extreme case query size 0.005%. On the other hand, the least significant model where a range query covers the entire data space. 
Parameter settings. Table 2 shows the list of parameters and the corresponding values tested in our experiments; unless otherwise stated, we use the default settings of a training set size of 100 thousand, a training query size of 0.01%, and an action space size k = 2. For all R-Tree variants evaluated in this paper, we maintain a maximum of 50 and a minimum of 20 child nodes per tree node.

Table 2: Parameters and Values

Parameters                      Values
Data distribution               SKE, GAU, UNI
Dataset size (million)          1, 5, 10, 20, 100
Training set size (thousand)    25, 50, 100, 200
Training query size (%)         0.005, 0.01, 2
Testing query size (%)          0.005, 0.01, 0.05, 0.1, 0.5, 1, 2
Action space size k             2, 3, 5, 10

The DQN models for both ChooseSubtree and Split contain 1 hidden layer of 64 neurons with SELU [20] as the activation function. In the training process, the learning rate is set to 0.003 for RL ChooseSubtree and 0.01 for RL Split. The initial value of ε is set to 1 and the decay rate is set to 0.99; ε is never allowed to drop below 0.1 in order to maintain a certain degree of exploration throughout model training. The replay memory can contain at most 5,000 (s, a, r, s′) tuples. A network update is done by first sampling a batch of 64 tuples from the replay memory; Θ is then updated by gradient descent on the MSE loss to close the gap between the Q-value predicted by Θ and the target Q-value derived from Θ−. The discount factor is set to 0.95 for RL ChooseSubtree and 0.8 for RL Split. Θ− is synchronized with Θ once every 30 network updates.

During the model training for RL ChooseSubtree (resp. RL Split), the deterministic splitting (resp. insertion) rule is set to be the same as that used by the reference tree, namely minimum overlap partition (resp. minimum node area enlargement). We train the RL ChooseSubtree and Split models for 20 and 15 epochs, respectively, and set parts in Algorithm 2 to 15, i.e., the training dataset is divided into 15 equal parts. The action space size k for both RL ChooseSubtree and RL Split is set to 2 by default. Note that the trivial case of k = 1 simply gives us the reference tree.

We train our models on an NVIDIA Tesla V100 SXM2 16 GB GPU using PyTorch 1.3.1. All indices, including both the RLR-Tree and the baselines, are coded in C++.
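The following PyTorch sketch mirrors the training configuration described above (a 64-unit SELU hidden layer, ε-greedy exploration with a floor of 0.1, a 5,000-tuple replay memory, batches of 64, MSE loss against a target network synchronized every 30 updates). It is a schematic reconstruction, not the authors' code: the state dimension, the optimizer choice, the granularity of the ε decay, and the tensor layout of replay tuples are assumptions.

```python
import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    """DQN with one hidden layer of 64 SELU units, as described in the setup."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.SELU(), nn.Linear(64, n_actions))

    def forward(self, state):
        return self.net(state)

STATE_DIM, K = 8, 2                         # STATE_DIM is an assumed feature size; K is the action space size
GAMMA, LR, SYNC_EVERY = 0.95, 0.003, 30     # RL ChooseSubtree; RL Split uses GAMMA=0.8, LR=0.01
theta = QNet(STATE_DIM, K)
theta_minus = QNet(STATE_DIM, K)
theta_minus.load_state_dict(theta.state_dict())
optimizer = torch.optim.SGD(theta.parameters(), lr=LR)   # plain gradient descent; optimizer choice assumed
replay = deque(maxlen=5000)                 # at most 5,000 (s, a, r, s') tuples, each stored as tensors
epsilon, n_updates = 1.0, 0

def select_action(state):
    """ε-greedy choice among the k shortlisted candidates."""
    if random.random() < epsilon:
        return random.randrange(K)
    with torch.no_grad():
        return int(theta(state).argmax())

def network_update():
    """One DQN update: batch of 64, MSE loss against the target network Θ⁻."""
    global epsilon, n_updates
    if len(replay) < 64:
        return
    s, a, r, s_next = map(torch.stack, zip(*random.sample(replay, 64)))
    q = theta(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * theta_minus(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    n_updates += 1
    epsilon = max(0.1, epsilon * 0.99)          # decay by 0.99, never below 0.1 (schedule granularity assumed)
    if n_updates % SYNC_EVERY == 0:
        theta_minus.load_state_dict(theta.state_dict())   # synchronize Θ⁻ with Θ every 30 updates
```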

5.2 Experimental Results

We first report the evaluation results of RL ChooseSubtree and RL Split, then the performance results of the proposed RLR-Tree, and finally other experimental results.

5.2.1 RL ChooseSubtree. We report the average RNA of RL ChooseSubtree on the three synthetic datasets by varying the query region size and the dataset size, respectively. Figure 4a shows that the RL ChooseSubtree model outperforms the R-Tree consistently over different query sizes. The best RNA is 0.07, observed on the GAU dataset for query size 0.005%. We also observe that the performance of RL ChooseSubtree gets better as the query size decreases. This is because a query with a larger size intersects a larger number of tree nodes, and more tree nodes are then traversed when answering such a query. In this case, all R-Tree based indices need to visit a relatively large portion of the data and the difference between different index techniques will diminish. Consider an extreme case where a range query covers the entire data space: then all tree nodes will be traversed when answering the query, irrespective of the index used, and thus the RNA will be close to 1.

[Figure 4: Performance of RL ChooseSubtree. Panels: (a) Different Query Sizes, (b) Different Dataset Sizes; y-axis: RNA; datasets: SKE, GAU, UNI.]

Figure 4b shows that the RL ChooseSubtree model outperforms the R-Tree consistently over different data sizes. It is remarkable that although the training dataset size is only 100,000, the trained model can be applied on large datasets successfully. Furthermore, we observe that the model performance gets better on larger datasets. This could be because, as the dataset gets larger, the index tree becomes larger and more RL based subtree selections are done on average. Hence, the accumulated benefits from the RL based ChooseSubtree enable the RL ChooseSubtree model to outperform the R-Tree more significantly.

We observe that RL ChooseSubtree has the most remarkable performance on the GAU dataset, compared with the datasets of the other distributions. The best RNA is as low as 0.06 on the dataset of size 100 million and query size 0.005%, i.e., the proposed RL ChooseSubtree operation outperforms the R-Tree by up to 94%. A possible explanation is that under the GAU distribution, spatial data objects are dense in the region close to the center of the data space, resulting in a large degree of overlap among tree nodes; hence many of these overlapping nodes are processed when answering queries in this region. As the RL model learns the relationship between overlapping nodes (reflected in the state space) and query performance (reflected in the reward), it is expected to learn to select a suitable action to alleviate this problem.
5.2.2 RL Split. To evaluate the query performance of the RL Split model, we conduct experiments on the 3 synthetic datasets of different sizes by running range queries of different sizes.

[Figure 5: Performance of RL Split. Panels: (a) Different Query Sizes, (b) Different Dataset Sizes; y-axis: RNA; datasets: SKE, GAU, UNI.]

When making comparisons among the data distributions, model performance is the most remarkable on the SKE dataset. The lowest RNA is 0.2, observed for data size 20 million and query size 0.005%. On the other hand, the least significant model performance is observed on the UNI dataset. This trend is consistent with RL ChooseSubtree.

Figure 5a shows that RL Split outperforms the R-Tree consistently over different query sizes, and the improvement can be up to 80%. We also observe that the RL Split model improves more over the R-Tree as the query size decreases; the possible reasons are similar to those discussed in Section 5.2.1 for RL ChooseSubtree. Figure 5b shows that the RL Split model outperforms the R-Tree consistently over datasets of different sizes. Note that the RL Split model is trained on a small training dataset with only 100,000 data objects, but can be applied on large datasets successfully. Moreover, we also observe that as the dataset size increases from 1 million to 20 million, the improvement over the R-Tree generally increases on all 3 distributions. However, model performance slightly deteriorates as the dataset size increases from 20 million to 100 million. The reason might be the inherent complexity of Split: node splitting results in the creation of 2 new nodes, and as the dataset size increases, the increase in tree height further adds to this complexity because more nodes may be split in one Split process, so the RL Split model trained on 100,000 data objects would have more room for improvement.

5.2.3 RLR-Tree. This experiment evaluates the query performance of the RLR-Tree, which is based on the RL ChooseSubtree and RL Split operations.

The enhanced training process. We first compare the performance of RL ChooseSubtree, RL Split and the RLR-Tree, which uses the enhanced training process, on the 3 synthetic datasets and 2 real datasets. As highlighted in Table 3, by applying the enhanced training process, the RLR-Tree outperforms both RL ChooseSubtree and RL Split on all datasets.

Table 3: RL ChooseSubtree, RL Split and RLR-Tree

                   SKE    GAU    UNI    CHI    IND
RLR-Tree           0.21   0.06   0.54   0.56   0.65
RL ChooseSubtree   0.29   0.08   0.56   0.60   0.67
RL Split           0.22   0.28   0.58   0.63   0.71

Range queries. We evaluate the query performance of the RLR-Tree on all 5 datasets using range queries of different sizes. As shown in Figures 6a and 6b, the RLR-Tree outperforms the three baselines, the R-Tree, the R*-Tree and the RR*-Tree, on all the synthetic datasets. The best performance is observed on the GAU dataset, where the RLR-Tree outperforms the RR*-Tree by 78.6% and the R*-Tree by 92% for query size 0.005%. Figures 6c and 6d show that the RLR-Tree outperforms the R*-Tree and the RR*-Tree by up to 27.3% and 22.9%, respectively. On all 5 datasets, the RLR-Tree has more significant advantages over the baselines for smaller query sizes. This observation is consistent with RL ChooseSubtree and RL Split, and possible reasons have been discussed in Section 5.2.1. For query size 2%, all the indexes have similar performance, which is similar to the results reported in previous work [32].

[Figure 6: RLR-Tree Performance (Range Queries). Panels: (a) Different Query Sizes, (b) Different Datasets, (c) CHI Dataset, (d) IND Dataset; y-axis: RNA; indices: RLR-Tree, RR*-Tree, R*-Tree.]

KNN queries. To evaluate the performance of the RLR-Tree for types of queries that are not used in the model training process, we look into K-Nearest-Neighbor (KNN) queries, a very popular type of spatial query. A KNN query returns the K nearest objects to a given query point. In our experiments, we consider KNN queries with different K values, i.e., K ∈ {1, 5, 25, 125, 625}, with K = 1 being the default setting. For each K value, 1,000 uniformly distributed query points are randomly generated in the data space. We use the algorithms proposed in [34] to compute KNN queries accurately. In Figure 7, we observe that the RLR-Tree outperforms the R*-Tree and the RR*-Tree by up to 93.5% and 30.4%, respectively. The RLR-Tree outperforms all the baselines in almost all the cases except for the SKE dataset, and its advantage over the RR*-Tree is more significant on the two real datasets. We also observe that the relative query performance of the RLR-Tree to the R-Tree gets better for larger K values. The RR*-Tree follows a similar trend, but not the R*-Tree.

[Figure 7: RLR-Tree Performance (KNN Queries). Panels: (a) Different K Values, (b) Different Datasets; y-axis: RNA; indices: RLR-Tree, RR*-Tree, R*-Tree.]

The finding that the RLR-Tree also outperforms the baselines is particularly interesting: the RLR-Tree is designed and trained to optimize the performance of range queries, rather than KNN queries. Additionally, we compute the reward and design the state space for the RLR-Tree in a way such that the model used to build the index is trained to minimize the number of node accesses when answering range queries, not KNN queries. However, despite these design features that do not favour KNN queries, the RLR-Tree still has the best performance most of the time for KNN queries.
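For readers unfamiliar with KNN processing over an R-Tree, the sketch below shows a generic best-first traversal that pops entries from a priority queue ordered by minimum distance to the query point. It is a textbook-style illustration rather than the exact branch-and-bound algorithm of [34], and the node interface (`children`, `is_leaf`, `mbr`) is an assumption.

```python
import heapq
import itertools

def min_dist(point, mbr):
    """Minimum Euclidean distance from a point to a rectangle (xmin, ymin, xmax, ymax)."""
    px, py = point
    xmin, ymin, xmax, ymax = mbr
    dx = max(xmin - px, 0.0, px - xmax)
    dy = max(ymin - py, 0.0, py - ymax)
    return (dx * dx + dy * dy) ** 0.5

def knn(root, point, k):
    """Best-first KNN search over an R-Tree; returns the k nearest data entries."""
    counter = itertools.count()                    # tie-breaker so the heap never compares nodes
    heap = [(0.0, next(counter), root, False)]     # (distance, tie, item, is_data_entry)
    result = []
    while heap and len(result) < k:
        dist, _, item, is_data = heapq.heappop(heap)
        if is_data:
            result.append((dist, item))            # data entries pop in non-decreasing distance order
            continue
        for child in item.children:                # children of a leaf are the data entries themselves
            heapq.heappush(heap, (min_dist(point, child.mbr), next(counter), child, item.is_leaf))
    return result
```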

5.2.4 The effect of parameters.

The Value of k. The first parameter we look into is the action space size k. As the top k candidates are shortlisted to form the action space, the value of k has a direct impact on the model performance, and it is important to evaluate the tradeoff between k and model performance. On one hand, when k is larger, more actions are available for the trained model to select from. On the other hand, model performance can be adversely affected when the action space is large, as the trained model may not do a good job of filtering out "bad" candidates. In order to find a good k value, we test different values of k on the 3 synthetic datasets of size 500,000.

[Figure 8: Effect of Parameters. Panels: (a) Varying k Values (RNA vs. k), (b) Varying Training Set Size (training time in seconds), (c) Varying Training Set Size (RNA), (d) Varying Training Query Size (RNA); datasets: SKE, GAU, UNI.]

We use RL ChooseSubtree as an example to show the effect of k on query performance, as shown in Figure 8a. We observe qualitatively similar trends on all 3 synthetic datasets and make 2 observations. Firstly, recall that k = 1 is the trivial case that generates the reference tree. By including one additional candidate in the action space, i.e., when k = 2, we achieve a significant query performance improvement for RL ChooseSubtree; the best improvement is 56% on the GAU dataset. Secondly, we observe that RL ChooseSubtree has the best result at k = 2 for all 3 synthetic datasets. As the value of k increases, model performance deteriorates gradually, which aligns with our expectation. When the value of k approaches and exceeds 10, the RL ChooseSubtree model starts to fail to outperform the R-Tree. We observe similar trends for RL Split on the three datasets and do not report the details here due to space limitations.
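To make the role of k concrete, the sketch below shows one plausible way a ChooseSubtree action space could be formed: shortlist the top-k child entries by the reference tree's insertion heuristic (minimum area enlargement, as stated in Section 5.1) and let the Q-network pick among them. The ranking criterion, the state-construction interface, and the helper names are assumptions for illustration, not the paper's exact procedure.

```python
def area(mbr):
    xmin, ymin, xmax, ymax = mbr
    return (xmax - xmin) * (ymax - ymin)

def enlargement(mbr, obj_mbr):
    """Area increase of mbr if it is enlarged to also cover obj_mbr."""
    xmin, ymin, xmax, ymax = mbr
    oxmin, oymin, oxmax, oymax = obj_mbr
    merged = (min(xmin, oxmin), min(ymin, oymin), max(xmax, oxmax), max(ymax, oymax))
    return area(merged) - area(mbr)

def choose_subtree(node, obj_mbr, q_network, make_state, k=2):
    """Shortlist the top-k children by minimum area enlargement, then let the DQN pick one."""
    candidates = sorted(node.children, key=lambda c: enlargement(c.mbr, obj_mbr))[:k]
    state = make_state(candidates, obj_mbr)            # paper-specific state features (assumed interface)
    scores = q_network(state)                          # one Q-value per shortlisted candidate
    action = int(scores[: len(candidates)].argmax())   # greedy choice at index-building time
    return candidates[action]
```

With k = 1 the sorted shortlist degenerates to the reference tree's own choice, which is why the paper treats k = 1 as the trivial case.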
Training Dataset Size. This experiment evaluates the effect of the training data size on the performance of the RLR-Tree. We would expect better query performance if we trained the RL ChooseSubtree and RL Split models on the whole dataset; however, training is very slow on a large dataset. Instead, we propose to train the models on a sample of the data and apply the models on the whole dataset to build indices. Figures 8b and 8c show the model training time and the model performance for RL ChooseSubtree when we vary the training dataset size. The model training time for different data distributions is almost the same for the same training dataset size. As expected, the query performance of the trained models improves as the training dataset size increases from 25,000 to 100,000, as shown in Figure 8c. However, we observe that the query performance becomes stable after the dataset size reaches 100,000, and the improvement of using a dataset of 200,000 over 100,000 is not significant. The model training time increases significantly with the size of the training dataset, as shown in Figure 8b. The results for RL Split are qualitatively similar to those for RL ChooseSubtree. Therefore, we set the training dataset size to 100,000 by default, which achieves a good tradeoff between training time and query performance. With the default setting, the training time for RL ChooseSubtree is 2.8 hours. Note that although high training time is a common characteristic of learned indices [31], the RLR-Tree only needs to be trained once on a sample of data, and the learned models can then be used on the whole dataset to build the index, whereas most other learned indices need to train their models on the whole dataset.

Training Query Size. The training query size has to be set very carefully, as it plays a key role in reward computation. This experiment evaluates the effect of the training query size on the query performance. Figure 8d shows the query performance of RL ChooseSubtree on the three synthetic datasets for training queries of different sizes. We observe that the query performance of RL ChooseSubtree becomes worse when using the largest training query size, i.e., 2%. On the GAU dataset, the model performance with a training query size of 2% is even 300% poorer than that with our default setting of 0.01%. On the other hand, when using a training query size of 0.005%, model performance is on par with our default setting on the GAU dataset and shows slightly poorer results on the SKE and UNI datasets. Based on these results, we set the training query size to 0.01% by default. We observe similar trends for RL Split and use the same default query size.

5.2.5 Other Experiments.

Index construction time and size. The index construction time is similar for datasets of the same size, irrespective of the data distribution. Therefore, we use the GAU dataset as an example to show the index construction time. As shown in Figure 9, the index construction time increases almost linearly with the dataset size. Among the indices, the RR*-Tree needs the shortest construction time and the RLR-Tree takes the longest. This is because the computation of state metrics and Q-values in our RL based indexing method incurs a high running time cost. However, it is still acceptable to take slightly more than 3 hours to construct an RLR-Tree containing 100 million spatial objects, and we expect the RLR-Tree construction time to be shorter when using better GPU devices.

[Figure 9: Index construction time for GAU datasets; x-axis: dataset size (million), y-axis: time (s); indices: RLR-Tree, RR*-Tree, R*-Tree.]

For index size, the RLR-Tree and the other baselines have almost the same size. Therefore, we report the RLR-Tree sizes for datasets of different sizes. As shown in Table 4, the RLR-Tree size increases linearly with the dataset size.

Table 4: Index size for GAU datasets

Dataset size (million)   1    5     10    20    100
RLR-Tree size (MB)       39   195   390   780   3900

Training Dataset Distribution. This experiment explores the possibility of training a model on one data distribution and then applying the trained model to datasets with different distributions. For this purpose, we train an RL ChooseSubtree model using the UNI dataset and apply it on the SKE and GAU datasets.

[Figure 10: RL ChooseSubtree for Different Model Training Distributions. Panels: (a) GAU Distribution (UNI vs. GAU training), (b) SKE Distribution (UNI vs. SKE training); y-axis: RNA.]

Figure 10 shows that the RL ChooseSubtree model trained on the UNI dataset outperforms the R-Tree on both the GAU and SKE datasets for all query sizes. However, its query performance is consistently worse than that of the models trained on the same data distribution as the dataset on which they are applied. The performance gap between UNI training and GAU training on the GAU dataset is significant, whereas the gap is generally smaller on the SKE dataset, especially for larger query sizes. We leave more in-depth studies of training and applying RL models across different data distributions to future work.

6 CONCLUSIONS AND FUTURE WORK

We propose the RLR-Tree to improve on the R-Tree and its variants designed for a dynamic environment. Experimental results show that the RLR-Tree is able to outperform the RR*-Tree, the best R-Tree variant, by up to 78.6% for range queries and up to 30.4% for KNN queries. Although the models of the RLR-Tree are trained on a small training dataset of size 100,000, the trained models are readily applied on large datasets.

This work takes a first step towards using machine learning to improve on the R-Tree, and we believe it opens a few promising directions for future work: 1) further explore and refine the design of states, action space and reward; 2) extend the idea to index enriched spatial data, such as spatio-temporal data, moving objects, and spatial-textual data; 3) explore other machine learning models.

REFERENCES
[1] Nicolas Anastassacos, Stephen Hailes, and M Musolesi. 2020. Partner selection for the emergence of cooperation in multi-agent systems using reinforcement learning. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20). Advancement of Artificial Intelligence (AAAI).
[2] Lars Arge, Mark De Berg, Herman Haverkort, and Ke Yi. 2008. The priority R-tree: A practically efficient and worst-case optimal R-tree. ACM Transactions on Algorithms (TALG) 4, 1 (2008), 1–30.
[3] Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, and Bernhard Seeger. 1990. The R*-tree: an efficient and robust access method for points and rectangles. In Proceedings of the 1990 ACM SIGMOD international conference on Management of data. 322–331.
[4] Norbert Beckmann and Bernhard Seeger. 2009. A revised R*-tree in comparison with related index structures. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data. 799–812.
[5] Jon Louis Bentley. 1975. Multidimensional binary search trees used for associative searching. Commun. ACM 18, 9 (1975), 509–517.
[6] Angjela Davitkova, Evica Milchevski, and Sebastian Michel. 2020. The ML-Index: A Multidimensional, Learned Index for Point, Range, and Nearest-Neighbor Queries. In EDBT. 407–410.
[7] Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, et al. 2020. ALEX: an updatable adaptive learned index. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 969–984.
[8] Jialin Ding, Vikram Nathan, Mohammad Alizadeh, and Tim Kraska. 2020. Tsunami: A learned multi-dimensional index for correlated data and skewed workloads. arXiv preprint arXiv:2006.13282 (2020).
[9] Christos Faloutsos. 1986. Multiattribute hashing using gray codes. In Proceedings of the 1986 ACM SIGMOD international conference on Management of data. 227–238.
[10] Christos Faloutsos and Shari Roseman. 1989. Fractals for secondary key retrieval. In Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. 247–252.
[11] Paolo Ferragina and Giorgio Vinciguerra. 2020. The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proceedings of the VLDB Endowment 13, 10 (2020), 1162–1175.
[12] Raphael A. Finkel and Jon Louis Bentley. 1974. Quad trees: a data structure for retrieval on composite keys. Acta Informatica 4, 1 (1974), 1–9.
[13] Yván J García R, Mario A López, and Scott T Leutenegger. 1998. A greedy algorithm for bulk loading R-trees. In Proceedings of the 6th ACM international symposium on Advances in geographic information systems. 163–164.
[14] Diane Greene. 1989. An implementation and performance analysis of spatial data access methods. In Proceedings. Fifth International Conference on Data Engineering. IEEE Computer Society, 606–607.
[15] Antonin Guttman. 1984. R-trees: A dynamic index structure for spatial searching. In Proceedings of the 1984 ACM SIGMOD international conference on Management of data. 47–57.
[16] Hosagrahar V Jagadish, Beng Chin Ooi, Kian-Lee Tan, Cui Yu, and Rui Zhang. 2005. iDistance: An adaptive B+-tree based indexing method for nearest neighbor search. ACM Transactions on Database Systems (TODS) 30, 2 (2005), 364–397.
[17] Ibrahim Kamel and Christos Faloutsos. 1993. On packing R-trees. In Proceedings of the second international conference on Information and knowledge management. 490–499.
[18] Kothuri Venkata Ravi Kanth, Divyakant Agrawal, Ambuj K Singh, and Amr El Abbadi. 1997. Indexing non-uniform spatial data. In Proceedings of the 1997 International Database Engineering and Applications Symposium (Cat. No. 97TB100166). IEEE, 289–298.
[19] Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann. 2020. RadixSpline: a single-pass learned index. In Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management. 1–5.
[20] Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. 2017. Self-normalizing neural networks. Advances in Neural Information Processing Systems 30 (2017), 971–980.
[21] Tim Kraska, Alex Beutel, Ed H Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The case for learned index structures. In Proceedings of the 2018 International Conference on Management of Data. 489–504.
[22] Lior Kuyer, Shimon Whiteson, Bram Bakker, and Nikos Vlassis. 2008. Multiagent reinforcement learning for urban traffic control using coordination graphs. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 656–671.
[23] Scott T Leutenegger, Mario A Lopez, and Jeffrey Edgington. 1997. STR: A simple and efficient algorithm for R-tree packing. In Proceedings 13th International Conference on Data Engineering. IEEE, 497–506.
[24] Pengfei Li, Hua Lu, Qian Zheng, Long Yang, and Gang Pan. 2020. LISA: A learned index structure for spatial data. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 2119–2133.
[25] Xi Liang, Aaron J Elmore, and Sanjay Krishnan. 2019. Opportunistic view materialization with deep reinforcement learning. arXiv preprint arXiv:1903.01363 (2019).
[26] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529–533.
[27] Vikram Nathan, Jialin Ding, Mohammad Alizadeh, and Tim Kraska. 2020. Learning multi-dimensional indexes. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 985–1000.
[28] Jack A Orenstein. 1986. Spatial query processing in an object-oriented database system. In Proceedings of the 1986 ACM SIGMOD international conference on Management of data. 326–336.
[29] Mirjana Pavlovic, Darius Sidlauskas, Thomas Heinis, and Anastasia Ailamaki. 2018. QUASII: query-aware spatial incremental index. In 21st International Conference on Extending Database Technology (EDBT).
[30] Martin L Puterman. 2014. Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons.
[31] Jianzhong Qi, Guanli Liu, Christian S Jensen, and Lars Kulik. 2020. Effectively learning spatial indices. Proceedings of the VLDB Endowment 13, 12 (2020), 2341–2354.
[32] Jianzhong Qi, Yufei Tao, Yanchuan Chang, and Rui Zhang. 2018. Theoretically optimal and empirically efficient R-trees with strong parallelizability. Proceedings of the VLDB Endowment 11, 5 (2018), 621–634.
[33] Kenneth A Ross, Inga Sitzmann, and Peter J Stuckey. 2001. Cost-based unbalanced R-trees. In Proceedings Thirteenth International Conference on Scientific and Statistical Database Management, SSDBM 2001. IEEE, 203–212.
[34] Nick Roussopoulos, Stephen Kelley, and Frédéric Vincent. 1995. Nearest neighbor queries. In Proceedings of the 1995 ACM SIGMOD international conference on Management of data. 71–79.
[35] Nick Roussopoulos and Daniel Leifker. 1985. Direct spatial search on pictorial databases using packed R-trees. In Proceedings of the 1985 ACM SIGMOD international conference on Management of data. 17–31.
[36] Zahra Sadri, Le Gruenwald, and Eleazar Leal. 2020. Online index selection using deep reinforcement learning for a cluster database. In 2020 IEEE 36th International Conference on Data Engineering Workshops (ICDEW). IEEE, 158–161.
[37] Timos Sellis, Nick Roussopoulos, and Christos Faloutsos. 1987. The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. Technical Report.
[38] Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z Leibo, Karl Tuyls, et al. 2018. Value-decomposition networks for cooperative multi-agent learning based on team reward. In Proceedings of the 17th international conference on autonomous agents and multiagent systems. International Foundation for Autonomous Agents and Multiagent Systems, 2085–2087.
[39] Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, and Joseph Antonakakis. 2019. SkinnerDB: Regret-bounded query evaluation via reinforcement learning. In Proceedings of the 2019 International Conference on Management of Data. 1153–1170.
[40] Haixin Wang, Xiaoyi Fu, Jianliang Xu, and Hua Lu. 2019. Learned Index for Spatial Queries. In 2019 20th IEEE International Conference on Mobile Data Management (MDM). IEEE, 569–574.
[41] Zheng Wang, Cheng Long, Gao Cong, and Yiding Liu. 2020. Efficient and effective similar subtrajectory search with deep reinforcement learning. Proceedings of the VLDB Endowment 13, 12 (2020), 2312–2325.
[42] Zongheng Yang, Badrish Chandramouli, Chi Wang, Johannes Gehrke, Yinan Li, Umar Farooq Minhas, Per-Åke Larson, Donald Kossmann, and Rajeev Acharya. 2020. Qd-tree: Learning data layouts for big data analytics. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 193–208.
[43] Xiang Yu, Guoliang Li, Chengliang Chai, and Nan Tang. 2020. Reinforcement learning with tree-LSTM for join order selection. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1297–1308.
[44] Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, et al. 2019. An end-to-end automatic cloud database tuning system using deep reinforcement learning. In Proceedings of the 2019 International Conference on Management of Data. 415–432.