CONFIDENTIAL UP TO AND INCLUDING 03/01/2017 - DO NOT COPY, DISTRIBUTE OR MAKE PUBLIC IN ANY WAY

Towards a representative benchmark for time series databases

Thomas Toye Student number: 01610806

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck Counsellors: Dr. ir. Joachim Nielandt, Jasper Vaneessen

Master's dissertation submitted in order to obtain the academic degree of Master of Science in de industriële wetenschappen: elektronica-ICT

Academic year 2018-2019


Preface

I would like to thank my supervisors, Prof. dr. Bruno Volckaert and Prof. dr. ir. Filip De Turck.

I am very grateful for the help and guidance of my counsellors, Dr. ir. Joachim Nielandt and Jasper Vaneessen.

I would also like to thank my parents for their support, not only during the writing of this dissertation, but also during my transitionary programme and my master’s.

The author gives permission to make this master dissertation available for consultation and to copy parts of this master dissertation for personal use. In all cases of other use, the copyright terms have to be respected, in particular with regard to the obligation to state explicitly the source when quoting results from this master dissertation.

Thomas Toye, June 2019

Towards a representative benchmark for time series databases

Thomas Toye

Master's dissertation submitted in order to obtain the academic degree of Master of Science in de industriële wetenschappen: elektronica-ICT

Academic year 2018–2019

Supervisors: Prof. dr. Bruno Volckaert, Prof. dr. ir. Filip De Turck Counsellors: Dr. ir. Joachim Nielandt, Jasper Vaneessen

Summary

As the fastest growing database type, time series databases (TSDBs) have experienced a rise in database vendors, and with it, a rise in difficulty in selecting the best one. TSDB benchmarks compare the performance of different databases to each other, but the workloads they use are not representative: they use random data, or synthesized data that is only applicable to one domain. This dissertation argues that these non-representative benchmarks may not always accurately model real world performance, and instead, representative workloads should be used in TSDB benchmarks. In this context, workloads are defined as consisting of data sets and queries. Workload data sets can be categorized using eight parameters (number of metrics, regularity, volume, data type, number of tags, tag value data type, tag value cardinality, variation). A new benchmark was created, which uses three representative workloads next to a baseline non-representative workload. Results of this benchmark show significant performance differences for data ingestion speed for complex data, latency and maximum request rate (when broad time ranges are used), and storage efficiency of data points when comparing representative and non-representative workloads. The results show that existing benchmarks may not be accurate for real world performance.

Keywords

Time series database, representative benchmarking, load testing

Towards a representative benchmark for time series databases

Thomas Toye

Supervisor(s): Bruno Volckaert, Filip De Turck

Abstract: As the fastest growing database type, time series databases (TSDBs) have experienced a rise in database vendors, and with it, a rise in difficulty in selecting the best one. TSDB benchmarks compare the performance of different databases to each other, but the workloads they use are not representative: they use random data, or synthesized data that is only applicable to one domain. We argue that these non-representative benchmarks may not always accurately model real world performance, and instead, representative workloads should be used in TSDB benchmarks. In this context, workloads are defined as consisting of data sets and queries. Workload data sets can be categorized using eight parameters (number of metrics, regularity, volume, data type, number of tags, tag value data type, tag value cardinality, variation). A new benchmark was created, which uses three representative workloads next to a baseline non-representative workload. Results of this benchmark show significant performance differences for data ingestion speed for complex data, latency and maximum request rate (when broad time ranges are used), and storage efficiency of data points when comparing representative and non-representative workloads. The results show that existing benchmarks may not be accurate for real world performance.

Keywords: Time series database, representative benchmarking, load testing

I. INTRODUCTION

Time series databases provide storage and interfacing for time series. In its simplest form, time series data are just data with an attached timestamp. This subtype of data has seen increasing interest in the last decade, especially with the rise of the Internet of Things, which produces time series for everything from temperature to sea levels. Other areas where time series are used are the financial industry (e.g. historical analysis of stock performance), the DevOps industry (e.g. capture of metrics from a server fleet) and the analytics industry (e.g. tracking ad performance over time).

Finding the best database to use is not an easy task. Eighty-three existing TSDBs were found by Bader et al. [1]. To determine the best one, benchmarks are used. However, these benchmarks may not be representative of the use case or industry the TSDB is needed for, which makes their results difficult to generalize.

In this abstract, we will first analyze existing TSDB benchmarks. Then, a new benchmark is proposed, which compares representative workloads to non-representative workloads. The results of this benchmark are analysed to determine whether non-representative benchmarks accurately predict performance for representative workloads.

II. EVALUATION OF EXISTING BENCHMARKS

Chen et al. [2] consolidate the properties of a good benchmark as follows: 1. Representative: benchmarks must simulate real world conditions; both the input to a system and the system itself should be representative of real world usage. 2. Relevant: benchmarks must measure relevant metrics and technologies, and results should be useful to compare widely-used solutions. 3. Portable: benchmarks should provide a fair comparison by being easily extensible to competing solutions that solve comparable problems. 4. Scalable: benchmarks must be able to measure performance at a wide range of scales, not just single-node performance but also cluster configurations. 5. Verifiable: benchmarks should be repeatable and independently verifiable. 6. Simple: benchmarks must be easily understandable, while making choices that do not affect performance.

Existing TSDB benchmarks were evaluated; a summary is shown in Table I. Two gaps in the state of the art are clear: current benchmarks insufficiently test TSDB performance at scale, and current benchmarks are not representative or only representative for a single use case. The data used is either random or synthetic; real world data are not used. This begs the question: are results of a non-representative benchmark generalizable to real world performance?

TABLE I
EVALUATION OF EXISTING TSDB BENCHMARKS
(TS-Benchmark, IoTDB-benchmark, TSDBBench, FinTime and influxdb-comparisons, evaluated on the properties representative, relevant, portable, scalable, verifiable and simple. TS-Benchmark is representative for IoT use cases only, FinTime for financial use cases only, and influxdb-comparisons for DevOps use cases only.)

III. BENCHMARK COMPONENTS

A new benchmark is developed to compare benchmark performance between representative and non-representative workloads. Workloads consist of a workload data set that is loaded into the TSDB and a workload query set that executes upon it.

A. Data set

Time series data sets have the following properties in common: data arrives in order, updates are very rare to non-existent, deletion is rare, and data values follow a pattern.

They differ on the following characteristics:
• Metrics: data points are organized in metrics, which can be compared to tables in relational databases.
• Regularity: in regular time series, data points are spaced evenly in time. Irregular time series do not emit data points regularly; they are often the result of event triggers.
• Volume: high volume time series may emit hundreds of thousands of data points a second, while low volume time series only emit one event a day.
• Data type: traditionally, values of data points in a time series have been integers or floating point numbers, but they can also be booleans, strings or even custom data types.
• Tags: a time series data point may have one or more tags associated with the timestamp and value. There may be no tags or a lot of tags. Tags may hold special values, such as geospatial information.
• Tag value cardinality: the number of possible combinations the tag values make. Three tags with two possible values each make a tag value cardinality of eight.
• Variation: while time series data usually follow a pattern, the variation in a series may be very different. One series may describe a flat line, while another may describe seasonal variations with daily spikes.

B. Query set

Bader et al. describe ten distinct TSDB query capabilities in [1]. These building blocks (e.g. update, delete, select from a time range) can form time series queries (e.g. select the mean of temperature values from last year, aggregated by day). Next to the queries themselves, the relative frequency of each query type is an important part of the query set.

C. Measurement characteristics

Measurement characteristics describe the performance metrics that are monitored to quantify performance. For TSDB benchmarks, common metrics include response latency (mean, 95th and 99th percentile, etc.), response size, data ingestion speed, and storage efficiency.

IV. A REPRESENTATIVE BENCHMARK

A benchmark was created with representativeness as its design goal. It compares three representative workloads to a non-representative workload to investigate possible performance differences. Three real world data sets, from domains in which TSDBs are prevalent, are used, next to a baseline. The baseline is a non-representative data set, with random values and tags. For every data set, twenty queries are written, relevant to the data set's domain (e.g. getting the average rating for a movie in the ratings data set), except for the baseline, for which a single query is used. Vegeta [3] was used to capture response latency (mean and 95th percentile), response times, and response size. The http_load program [4] was used for load testing. Standard UNIX tools were used for storage efficiency analysis. Four TSDBs are tested: InfluxDB, OpenTSDB, KairosDB with Cassandra as a backing database, and KairosDB with ScyllaDB. These are modern, open source databases with an HTTP interface.

Table II shows an overview of the data sets used. The baseline is a data set with random values and tags, the financial data set uses historical stock market information, the rating data set uses movie reviews, and the IoT data set is produced by power information for a house.

TABLE II
OVERVIEW OF WORKLOAD DATA SETS

                        Baseline   Financial   Rating      IoT
Metrics                 1          6           1           7
Regularity              Regular    Semi-reg.   Irregular   Regular
Volume                  Low        Low         Low         Low
Tags                    2          1           5           0
Tag value cardinality   10,000     7,164       20M         0
Variation               High       Low         High        Low
Total data points       20M        74.4M       20M         14.5M
License                 NA         CC0         Custom      CC-BY-4

V. EVALUATION

A. Storage efficiency

Figure 1 shows relative storage efficiency. The size in bytes per data point was compared to the size per data point in the comma separated value (CSV) source. The input size was one million data points for every data set. It shows that representative data sets have different storage efficiency than the reference. OpenTSDB is better at storing real world data sets than synthesized data, InfluxDB much worse. Tag value cardinality and data point value variation are thought to have a high impact on storage efficiency.

[Fig. 1. Relative storage efficiency of different TSDBs per data point compared to the CSV source format.]

B. Data ingestion throughput

For every data set, one million data points were loaded into each TSDB and ingestion speed was measured (in data points per second). The results are shown in Figure 2. For the representative ratings workload, performance is degraded, especially for InfluxDB. This is a data set with high tag cardinality and complex tag values.

[Fig. 2. Data points ingested per second. Data sets used were one million data points each.]

C. Load testing

Figure 3 shows the results of the load test. The results for OpenTSDB are surprising: it performed well for the baseline and IoT query workloads, but not for the financial and ratings query workloads. For the latter two workloads, the time ranges are very broad, so the database has to scan more data. The other TSDBs may be able to optimize this operation better.

[Fig. 3. Maximum requests per second. Tests were performed on data sets one million data points in size.]

D. Response latency

Figure 4 shows the mean response latency when using a representative query set. A performance degradation for OpenTSDB surfaces for the financial and ratings query workloads, which use broad time ranges. This is attributed to the same cause as in Section V-C. Otherwise, the baseline is a good predictor for relative performance in the representative benchmarks.

[Fig. 4. Mean latency per request.]

E. Response size

Figure 5 shows the mean response size of TSDBs in bytes. The mean response size is correlated with the data set. The size differences for large responses (e.g. the financial workload) can be attributed mainly to timestamp encoding.

[Fig. 5. Mean size in bytes of the TSDB response.]

VI. CONCLUSIONS

Compared to a baseline non-representative workload, representative workloads showed significant performance differences when it came to storage efficiency, data ingestion speed for complex data, latency and maximum request rate (when broad time ranges are used). Existing TSDB benchmarks do not use representative workloads, so their relevance may be called into question.

The fact that not all representative workloads show a performance impact highlights the importance of using multiple representative workloads for general TSDB benchmarks: just one representative workload may not be enough to highlight possible deviations or performance degradations.

It is impractical to create a representative workload for every domain, but TSDB workloads can be characterized by workload parameters. Further research is needed to determine if these parameters are enough to accurately describe a TSDB workload and thus generalize results of one workload to another with the same workload parameters.

REFERENCES
[1] Andreas Bader, Oliver Kopp, Michael Falkenthal, "Survey and Comparison of Open Source Time Series Databases", Datenbanksysteme für Business, Technologie und Web (BTW 2017) – Workshopband.
[2] Yanpei Chen, Francois Raab, Randy Katz, "From TPC-C to Big Data Benchmarks: A Functional Workload Model", Specifying Big Data Benchmarks, WBDB 2012. Lecture Notes in Computer Science, vol 8163. Springer, Berlin, Heidelberg.
[3] Tomás Senart, Vegeta – HTTP load testing tool and library, https://github.com/tsenart/vegeta
[4] Jef Poskanzer, http_load, https://acme.com/software/http_load/

Contents

Preface

Abstract

Extended abstract

Table of Contents

1 Introduction

2 Literature review
  2.1 Databases
    2.1.1 Database Management Systems
    2.1.2 Relational databases
    2.1.3 Non-relational databases
    2.1.4 NewSQL databases
    2.1.5 Time series databases
  2.2 Time series database benchmarks
    2.2.1 TS-Benchmark
    2.2.2 IoTDB-benchmark
    2.2.3 TSDBBench
    2.2.4 FinTime
    2.2.5 influxdb-comparisons
    2.2.6 STAC-M3
  2.3 Data sets

3 State of the art
  3.1 Uses of time series databases
    3.1.1 TSDB usage as a data store
    3.1.2 Inherent time series database functions used
    3.1.3 Common characteristics of time series data
    3.1.4 Differing characteristics of time series data
    3.1.5 Industry use cases
  3.2 A "good" benchmark
  3.3 Existing benchmarks
    3.3.1 TS-Benchmark
    3.3.2 IoTDB-benchmark
    3.3.3 TSDBBench/YCSB-TS
    3.3.4 FinTime
    3.3.5 influxdb-comparisons
  3.4 Evaluation of existing benchmarks
    3.4.1 On scalability
    3.4.2 On representativeness
  3.5 Contribution

4 A new benchmark
  4.1 Benchmark components
    4.1.1 Workload data set characteristics
    4.1.2 Workload query characteristics
    4.1.3 Measurement characteristics
  4.2 Design of a representative data workload
    4.2.1 A baseline workload
    4.2.2 A financial time series workload
    4.2.3 A rating system workload
    4.2.4 An IoT workload
    4.2.5 Workload data set overview
    4.2.6 Data set pre-processing
  4.3 Design of a representative query workload
    4.3.1 Queries for the baseline workload
    4.3.2 Queries for the financial workload
    4.3.3 Queries for the rating workload
    4.3.4 Queries for the IoT workload
  4.4 Metrics
  4.5 Technical implementation
    4.5.1 Test environment
    4.5.2 Data ingestion
    4.5.3 Load and latency testing
  4.6 Design evaluation

5 Results
  5.1 Storage efficiency
  5.2 Data ingestion throughput
  5.3 Load testing with query workload
  5.4 Response latency
  5.5 Mean response size
  5.6 Evaluation

6 Conclusions and future work
  6.1 Conclusions
  6.2 Future work

A Detailed results
  A.1 Data ingestion throughput
  A.2 Storage efficiency
  A.3 Load testing
  A.4 Response latency
  A.5 Mean response size

Bibliography

List of Abbreviations

List of Figures

List of Tables

Chapter 1

Introduction

Time series databases provide storage and interfacing for time series. In its simplest form, time series data are just data with an attached timestamp. This subtype of data has seen increasing interest in the last decade, especially with the rise of the Internet of Things, which produces time series for everything from temperature to sea levels. Other areas where time series are used are the financial industry (e.g. historical analysis of stock performance), the DevOps industry (e.g. capture of metrics from a server fleet) and the analytics industry (e.g. tracking ad performance over time).

Time Series Databases (TSDBs) are the fastest growing type of databases. When selecting a TSDB, performance is one of the main considerations. Comparing database performance is done using benchmarks, and for TSDBs, a number of benchmarks already exist. However, these all use either random data or synthetic data. Moreover, TSDBs have a wide range of applications, and representative synthesized data is only valid for one domain. Thus, the data used for benchmarks is either non-representative, or only representative for one use case or industry. Can the results of performance tests with random or generated data be generalized to the real world?

In this dissertation, existing TSDB benchmarks are first analyzed. Then, the properties of time series data sets are examined. Finally, a new benchmark is proposed, which compares representative workloads to non-representative workloads.

Chapter 2

Literature review

2.1 Databases

A database is a set of data, organized in a form that makes it easy to process.

2.1.1 Database Management Systems

A Database Management System (DBMS) is an application for management of databases. Apart from the creation and deletion of databases, a DBMS allows create, read, update and delete (CRUD) operations on these databases.

A database is the data itself and how it is organized. The term “database” is often used instead of “DBMS”. In this dissertation, the two are used interchangeably.

2.1.2 Relational databases

Edgar Codd introduced the relational model in 1970 [1]. Relational databases use this model to store data: records are represented as rows, their attributes are organized in columns, and related records are grouped into tables. A relational DBMS (RDBMS) will most often use Structured Query Language (SQL) for data retrieval and manipulation.

2.1.3 Non-relational databases

As applications began to scale, companies started moving away from traditional RDBMSs for the following reasons [2]:

• In traditional DBMSs, the focus on correctness leads to degraded performance.

• The relational model was thought not to be the best way to store data.

• The DBMSs were often used as simple data stores. A full-blown DBMS was overkill for such use cases.

These factors caused a move to so-called "NoSQL" databases. The term originally referred to databases that do away with the relational structure of RDBMSs, but it has taken on the meaning of "Not only SQL" [3]. Cattell [4] identifies six key features of NoSQL DBMSs:

1. Horizontal scalability

2. Replication and partition of data over many machines

3. Simple interface (relative to SQL)

4. Weaker concurrency model (compared to ACID nature of relational DBMSs)

5. Distributed indexes used for data storage

6. Able to add new attributes to existing data

NoSQL databases generally relax the strict correctness guarantees found in relational databases. For example, transactions may not be available in NoSQL DBMSs, or writes may take a while to propagate and show up in reads.

2.1.4 NewSQL databases

NewSQL databases try to bridge RDBMS and NoSQL DBMS differences by bringing relational semantics to NoSQL DBMSs [3]. The aim is to have the best of both worlds: the relational model of RDBMSs and the scalability and fault tolerance of NoSQL DBMSs.

2.1.5 Time series databases

Time series databases (TSDBs) are databases optimised for storing time series. Time series are represented in these databases as data points with a value, a timestamp, and metadata, such as a metric name, tags, and geospatial information.

Time series databases can be relational (e.g. Timescale, a NewSQL DBMS) or non-relational (e.g. InfluxDB, a NoSQL DBMS) databases.

Bader et al. [5] identified 75 TSDBs, of which 42 are open source and 33 are proprietary.

2.2 Time series database benchmarks

There are a number of existing benchmarks tailored to TSDBs. This is a recent development: most of these benchmarks were developed less than three years ago.

2.2.1 TS-Benchmark

TS-Benchmark is a benchmark specifically developed for TSDBs by Chen at the Renmin University of China in December 2018. The benchmark models a wind farm scenario: sensor data are appended and queried [6].

Databases tested in this benchmark are InfluxDB, IoTDB, TimescaleDB, Druid, and OpenTSDB. The benchmark is written in Java and uses no external dependencies or frameworks.

Apart from a presentation, not much information is available on TS-Benchmark. Metrics measured by TS-Benchmark:

• Load performance: The ingestion speed of the TSDB, measured in points loaded per second

• Throughput performance: new data points appended to an existing time series (measured in points appended per second)

• Query performance: For both simple aggregation queries and time range queries, read queries are performed and two measurements are made: requests per second and average response time.

• Stress test: Two stress tests are performed. In the first, data points are appended while a constant number of queries are run (performance measured in points appended per second). In the second, queries are run while a constant number of data points are appended (performance measured in requests per second and average response time).

Load performance is different from throughput performance. The former measures the importing of a big data set into the database, while the latter measures appending points in real-time. It is unclear if the benchmark uses special facilities to test load performance (e.g. bulk or batch functionality from the TSDB) or if importing is needed to test read queries.

2.2.2 IoTDB-benchmark

In a preprint paper on arXiv, Liu and Yuan describe IoTDB-benchmark [7]. The features that set this benchmark apart from basic benchmarks are generation of out-of-order data, measurement of system resources in addition to database performance metrics, and simulation of real-world conditions by running heterogeneous queries concurrently. IoTDB-benchmark is written in Java.

IoTDB-benchmark has 10 types of queries, ranging from "latest data point" to "time range query with value filter". InfluxDB, OpenTSDB, KairosDB, and TimescaleDB are targeted by IoTDB-benchmark. The benchmark also supports Cloud Time Series Database (CTSDB), a TSDB created by Tencent Cloud1, but this is not mentioned in the paper.

Metrics measured by IoTDB-benchmark:

• Query latency: Statistical metrics, such as average, maximum, 95th percentile, etc., are calculated on the time the ten supported query types take.

• Throughput performance: Data points appended to an existing time series, measured in points appended per second.

• Space consumption: The used disk space is measured.

• System resources: System resources such as CPU time, network, memory and I/O usage are measured.

2.2.3 TSDBBench

TSDBBench was created by Bader as part of his dissertation in 2016. It extends the Yahoo! Cloud Serving Benchmark (YCSB) for use with time series databases in a project called YCSB-TS. TSDBBench includes YCSB-TS, the benchmark itself, and Overlord, a provisioning system written in Python that sets up databases to test [5].

In practice, the benchmark seems unmaintained. The documentation is out of date, necessary files are hosted on a defunct domain, and the database versions tested are several years old.

Ten types of queries are supported, such as "insert", "update", "scan" and "sum". TSDBBench supports eighteen databases, which is the most of any TSDB benchmark.

Metrics measured by TSDBBench:

1 Not much documentation on CTSDB is available, and all of it is in Chinese.

• Query latency: Statistical metrics, such as average, maximum, 95th percentile, etc., are calculated on the time the ten supported query types take.

• Space consumption: The used disk space is measured.

2.2.4 FinTime

FinTime was developed in 1999. It is not written in a specific language: FinTime is merely a description of a benchmark. The benchmark describes two models, each with a data model, queries, and operational characteristics [8]. They contain nine queries run by five clients at once, and six queries run by fifty clients at once, respectively.

Metrics measured by FinTime:

• Query latency (defined as “Response Time Metric”): The geometric mean of query latencies.

• Throughput Metric: Average time that a complete set of queries takes. Every set (nine queries for the first model, six for the second) represents a user.

• Cost metric: Defined as (R × T) / TC, where R is the response time metric, T is the throughput metric, and TC is the total cost of the system in USD. This metric provides insight into the cost-effectiveness of a system.
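As an illustration of how these three FinTime metrics combine, the sketch below computes them in Python from the definitions above; the function names and the example numbers are hypothetical and not part of FinTime itself.

```python
import math

def response_time_metric(latencies):
    """Response Time Metric: geometric mean of individual query latencies (seconds)."""
    return math.exp(sum(math.log(l) for l in latencies) / len(latencies))

def throughput_metric(set_durations):
    """Throughput Metric: average time a complete set of queries takes (seconds)."""
    return sum(set_durations) / len(set_durations)

def cost_metric(r, t, total_cost_usd):
    """Cost metric: (R x T) / TC; a lower value indicates a more cost-effective system."""
    return (r * t) / total_cost_usd

# Purely illustrative numbers.
r = response_time_metric([0.8, 1.2, 2.5, 0.9])
t = throughput_metric([12.0, 14.5, 13.2])
print(round(cost_metric(r, t, total_cost_usd=25_000), 6))
```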

2.2.5 influxdb-comparisons

The influxdb-comparisons project was created by InfluxData, the company that develops InfluxDB. It compares the InfluxDB TSDB to other databases. The project is written in Go and was started in 2016.

At this moment, the benchmark supports InfluxDB, Elasticsearch, Cassandra, MongoDB and OpenTSDB.

Metrics measured by influxdb-comparisons:

• Space consumption: After batch loading data, disk usage is measured.

• Load performance: Measured in time taken to load the data and average ingestion rate.

• Query latency: Measured as queries per second.

2.2.6 STAC-M3

STAC-M3 is a closed-source benchmark that measures performance of TSDB stacks, focused on high-speed applications. The publications, specification, and application itself are only accessible to Securities Technology Analysis Center (STAC) members.

At the moment, only results for the kdb+ database have been published publicly. The following metrics are measured:

• Storage efficiency: The size of the original data set divided by the size of the database.

• Mean and maximum response times for a variety of scenarios. For most scenarios, minimum and median response times are also reported, as well as the standard deviation.

2.3 Data sets

To study and create benchmarks for TSDBs, it is important to understand the fields where time series are recorded and analyzed. Six existing repositories of time series data sets were discovered.

Dau et al. maintain a repository of 128 time series data sets for data mining and machine learning purposes [9]. The data sets range from electricity usage to accelerometer data of performed gestures. Every data set is cleaned and documented.

The Center for Machine Learning and Intelligent Systems at the University of California maintains a database of data sets for use with machine learning [10]. Ninety-two time series data sets are currently in their repository, with domains ranging from stress detection and retail to electricity consumption and parking occupancy rates.

Hyndman created the Time Series Data Library (TSDL), which contains about eight hundred time series data sets [11]. TSDL spans many domains, from hydrology and finance to crime and physics.

A "data catalog start-up" called data.world currently has thirty-four time series data sets in its repository [12]. The data sets are mostly governmental statistics, such as crime data and pollution indexes.

On Kaggle, 238 data sets show up when searching for time series databases. These data sets are contributed by different authors.

Leskovec and Krevl maintain the Stanford Network Analysis Project (SNAP) data sets [13]. These data sets are often graphs, but the online reviews and online communities data sets contain time series data.

Chapter 3

State of the art

In this chapter, the various uses of time series databases are examined. Then, existing benchmarks are evaluated, and gaps in the state of the art are identified.

3.1 Uses of time series databases

3.1.1 TSDB usage as a data store

Some use cases do not exploit the full potential of time series databases; they merely use a time series database as a data store for time-coupled data. While the data could be stored in another data store, using a time series database offers clear advantages:

• Compression: Since time series data arrives mostly in order, high compression ratios can be achieved efficiently with delta coding, or with more advanced compression algorithms such as SPRINTZ [14] (see the sketch after this list).

• Scalability: Most modern time series databases come with scalability built-in, removing the need to worry about data migration when applications become bigger or more data-intensive.

• Usage of inherent time functions when needed: Even if an application makes no use of time series functions, it could do so at a later time, without the need for data migration. This also holds true for arbitrary queries: when engineers want to run time-based arbitrary queries, they can do so without data transformation or migration.
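As a minimal sketch of the delta-coding idea mentioned in the compression item above (assuming integer timestamps; the helper names are illustrative and not taken from any particular TSDB):

```python
def delta_encode(timestamps):
    """Keep the first timestamp and store only the differences between consecutive ones."""
    if not timestamps:
        return []
    return [timestamps[0]] + [cur - prev for prev, cur in zip(timestamps, timestamps[1:])]

def delta_decode(deltas):
    """Reconstruct the original timestamps by summing the deltas back up."""
    timestamps = []
    total = 0
    for d in deltas:
        total += d
        timestamps.append(total)
    return timestamps

# Regular, in-order timestamps (one per second) encode to a highly repetitive
# sequence, which storage engines and generic compressors can exploit.
ts = [1546300800, 1546300801, 1546300802, 1546300803]
assert delta_decode(delta_encode(ts)) == ts
print(delta_encode(ts))  # [1546300800, 1, 1, 1]
```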

Anomaly detection, forecasting and prediction are examples that usually use the time series database as a data store: a separate application provides the processing.

3.1.2 Inherent time series database functions used

Most TSDBs are not simple data stores, but provide specialised functions to handle time series analysis and aggregation. Bader et al. [5] describe the following time series database capabilities:

• INS: Insertion of a single data point

• UPDATE: Update of one or more data points with a certain timestamp

• READ: Retrieval of one or more data points with a certain timestamp

• SCAN: Retrieval of rows in a timestamp range

• AVG: Calculates the average value in a time range

• SUM: Calculates the sum of values in a time range

• CNT: Counts the number of data points with a certain timestamp

• DEL: Deletes data points with a certain timestamp

• MAX: Calculates the maximum value in a time range

• MIN: Calculates the minimum value in a time range

Functions that calculate a value, such as SUM, can be aggregated in time periods. Time series databases provide first-class support for queries like "average of temperature grouped in blocks of 7 minutes" and "highest CPU usage for every hour".
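As an example of such a first-class aggregation, the sketch below sends a "mean temperature in 7-minute buckets" InfluxQL query to InfluxDB's 1.x HTTP query endpoint; it assumes the requests package, a locally running InfluxDB instance, and example database and measurement names.

```python
import requests

# Average temperature over the last day, aggregated in 7-minute buckets.
# The database "benchmark" and measurement "temperature" are example names.
query = (
    'SELECT MEAN("value") FROM "temperature" '
    "WHERE time > now() - 1d GROUP BY time(7m)"
)

response = requests.get(
    "http://localhost:8086/query",            # default InfluxDB 1.x HTTP port
    params={"db": "benchmark", "q": query},
    timeout=10,
)
response.raise_for_status()

for result in response.json().get("results", []):
    for series in result.get("series", []):
        # Each entry in "values" is a (timestamp, mean) pair.
        print(series["name"], series["values"][:3])
```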

Visualisation is an example that relies heavily on these features. To provide users with flexible visualisation options, the database needs to support, or at least facilitate, the above functions.

3.1.3 Common characteristics of time series data

While time series are used in different industries for a variety of use cases, in general, time series data have the following characteristics:

• In-order data arrival: Data will, with rare exceptions, arrive with ascending time stamps.

• Updates are rare to non-existent: Changes to existing data points are rare and not part of normal operations.

• Deletion is rare: It is uncommon for individual data points to be deleted, but it may be common to retire a large amount of data points at a time, for example, when data points are being retired as part of a retention policy.

• TSDB-specific functions may be heavily used, depending on the ap- plication.

• Data values follow a pattern: There might be trends, as well as seasonal and non-seasonal cycles. It is rare for time series data to be completely random.

3.1.4 Differing characteristics of time series data

While time series data have general characteristics, series may diverge on the following properties:

• Regularity: In regular time series, data points are spaced evenly in time. Irregular time series do not emit data points regularly. Irregular time series are often the result of event triggers.

• Volume: High volume time series may emit hundreds of thousands of data points a second, while low volume time series only emit one event a day.

• Data type: Traditionally, values of data points in a time series have been integers or floating point numbers. But they can also be booleans, strings or even custom data types.

• Tags: A time series data point may have one or more tags associated with the timestamp and value. There may be no tags or a lot of tags. Tags may hold special values, such as geospatial information.

• Tag value cardinality: The number of possible combinations the tag values make, i.e. the product of the number of possible values of each tag. Three tags with two possible values each make a tag value cardinality of eight (see the sketch after this list).

• Variation: While time series data usually follow a pattern, the variation in a series may be very different. One series may describe a flat line, while another may describe seasonal variations with daily spikes.
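A small sketch of the cardinality calculation described above, taking the cardinality as the product of the number of possible values of each tag (Python 3.8+ for math.prod):

```python
from math import prod

def tag_value_cardinality(values_per_tag):
    """Number of distinct tag-value combinations: the product of the
    number of possible values of each tag."""
    return prod(values_per_tag)

print(tag_value_cardinality([2, 2, 2]))    # three tags, two values each -> 8
print(tag_value_cardinality([100, 100]))   # two tags with 100 values each -> 10,000
```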

3.1.5 Industry use cases

Internet of Things and sensor data

The Internet of Things revolution has made it possible to connect devices to the internet that were previously only available as offline systems. These devices can be split into two categories: actuators, to which commands can be sent to perform an action, and sensors, which sense the current environment and translate physical quantities into digital values.

The values sent from these sensors and the usual analyses performed upon them are a natural fit for time series databases. Every data point generated by a sensor is associated with a timestamp (the time at which it was produced). The frequency of data generation depends on the application domain; common intervals are every minute, every ten minutes and every hour.

Common operations on sensor data include getting the most recent data points, averaging data points over time intervals and flexible visualisation. IoT data sets are usually regular, low volume for small numbers of sensors, and often make use of geospatial tags.

Financial

Time series have long been a subject of study in financial disciplines. Stock information, exchange rates and portfolio valuations can all be represented as time series; thus, a time series database is a logical choice to store financial data points.

For example, kdb+, a time series database developed by Kx Systems, is often used in high-frequency trading. kdb+ also explicitly presents other financial use cases, such as algorithmic trading, forex trading, and regulatory management.

Financial time series are regular, but differ greatly in volume. Data points may be produced anywhere from once a day (e.g. stock closing prices) to every few milliseconds (e.g. high-frequency trading).

DevOps and machine monitoring applications

In the operations and DevOps industries, TSDBs are used extensively to monitor computer systems and software applications. Common metrics include processor load, memory usage and application response times. Metrics are usually aggregated on the device they are collected from in one-minute intervals before being sent to a metrics collector.

The collected data are used for manual analysis (e.g. "What is the slowest component in our stack?"), alerting (e.g. sending an alert when the average load is above 90% for more than 5 minutes) and automatic anomaly detection.

Software monitoring and DevOps use cases produce regular time series that are low volume for small numbers of machines and applications.

Asset tracking

Apart from software applications, time series databases are also often used to monitor physical systems. Most time series databases include support for storing and querying spatial data. This way, it is possible to associate location data with stored data points.

Use cases include asset tracking (e.g. storing current location of vehicles at a point in time) and geographical filtering (e.g. average of temperature for sensors within a range).

Asset tracking use cases produce data points with geospatial information. The time series produced can be regular (e.g. a location is sent every minute), but are often irregular. Since asset tracking use cases involve tracking entities in a large geographical area or in rough terrain, connectivity may be limited. This means that accurately determining a position and transmitting it may be impacted, resulting in irregular time series.

Analytics

In analytics, time series may be used to monitor website visits, advertisement clicks, or E-commerce orders.

Time series are used to track key performance indicators (KPIs) and infrastructure costs at Houghton Mifflin Harcourt [15]. KPIs can give an insight into the performance of the business.

These use cases produce irregular time series, since they are based on events. The volume may depend on various factors, such as the time (e.g. orders on a Wednesday night compared to orders on Black Friday), the weather (e.g. umbrellas sold in a convenience store), or other arbitrary factors (e.g. number of cars per hour on a day with a train strike).

Physics experiment tracking

Time series databases have been used in physics experiments to capture and process high volume data streams. For example, at CERN, the time series database InfluxDB handles writes at a rate of over 700 kHz [16].

Other use cases

Other use cases include game bot detection based on time series classification [17], telecommunications forecasting based on usage pattern prediction, and fraud detection through pattern analysis.

3.2 A “good” benchmark

Chen et al. [18] consolidate the properties of a good benchmark based on previous research as follows:

• Representative: Benchmarks must simulate real-world conditions; both the input to a system and the system itself should be representative of real-world usage.

• Relevant: Benchmarks must measure relevant metrics and technologies. Results should be useful to compare widely-used solutions.

• Portable: Benchmarks should provide a fair comparison by being easily extensible to competing solutions that solve comparable problems.

• Scalable: Benchmarks must be able to measure performance at a wide range of scales: not just single-node performance, but also cluster configurations.

• Verifiable: Benchmarks should be repeatable and independently verifiable.

• Simple: Benchmarks must be easily understandable, while making choices that do not affect performance.

These properties can be used to put existing benchmarks to the test. Relevance of individual benchmarks will not be evaluated. All of these benchmarks evaluate time series databases. Since TSDBs are the fastest growing type of database [19], we consider all benchmarks relevant.

3.3 Existing benchmarks

Here, existing benchmarks for time series databases are examined in more detail, and the properties described in Section 3.2 are discussed.

3.3.1 TS-Benchmark

TS-Benchmark is a benchmark simulating a wind plant monitoring system.

• Representative: TS-Benchmark uses a data model inspired by real world applications. An ARIMA time series model is trained with real-world wind power data [6].

• Portable: TS-Benchmark targets InfluxDB, IoTDB, TimescaleDB, Druid and OpenTSDB.

• Scalable: Only single-node performance of database systems is tested. The benchmark could be extended to perform on multi-node database systems.

• Verifiable: The source code for TS-Benchmark was published on GitHub.

• Simple: The benchmark follows a simple five-stage course, in which each stage performs a single operation or test.

3.3.2 IoTDB-benchmark

In a recent paper, for now only published on ArXiv, Liu et al. describe IoTDB-benchmark, a benchmark specifically designed for time series databases [7].

• Representative: The data generator creates square waves, sine waves and sawtooth waves with optional noise. Furthermore, constant values and random values within a range can be generated. Care needs to be taken when selecting a data generation function: rarely will real-world data follow a perfect sine function. This will have an effect on the compaction of data. To ensure representativeness of data, the "random values within a range" function is the best approximation. However, depending on the use case, it will still not be representative of most real-world data, where subsequent data points may have a relatively low delta compared to other points close in time instead of a completely random delta. IoTDB-benchmark allows configuration of many data generation parameters, such as the data type of fields, number of tags per device, etc.

• Portable: IoTDB-Benchmark supports IoTDB, InfluxDB, OpenTSDB, KairosDB, TimescaleDB, and CTSDB. The focus is on IoTDB, and not all functions are supported in databases other than IoTDB. For example, generation and insertion of customized time series is only supported for IoTDB at the moment.

• Scalable: Only single-node performance of database systems is tested. The benchmark could be extended to perform on multi-node database systems.

• Verifiable: The source code for IoTDB-Benchmark was published on GitHub.

• Simple: The benchmark follows a simple six-stage course, in which each stage performs a single operation or test.

3.3.3 TSDBBench/YCSB-TS

YCSB-TS, part of the TSDBBench benchmark, is a fork of YCSB that targets time series databases, since these databases are not supported in YCSB.

• Representative: YCSB-TS allows configuration of the workload used. Selecting or creating a good workload is critical in ensuring that the benchmark is representative. The standard workload is artificial and not based on real-world data.

• Portable: YCSB-TS supports InfluxDB, KairosDB, Blueflood, Druid, NewTS, OpenTSDB and Rhombus.

• Scalable: YCSB-TS has support for benchmarking multi-node set-ups. Tests were performed with single-node set-ups and five-node set-ups [5].

• Verifiable: The source code for all components of TSDBBench was published on GitHub, along with instructions on how to replicate the benchmark.

• Simple:

3.3.4 FinTime

FinTime is an older benchmark (it was proposed in 1999), but it still holds value as a representative benchmark. It mimics financial industry use cases.

• Representative: FinTime's two models are based on real-world financial use cases. Namely, it specifies data generation and queries for historical financial market information and a tick database for financial instruments.

• Portable: FinTime does not prescribe a query language. Implementations have been created for SQL databases, but SQL is not required.

• Scalable: The benchmark was performed on single-node database systems, but could be extended to work on multi-node systems.

• Verifiable: Only the source code for the data generation was published. It is unclear how latency and throughput are measured.

• Simple: Since FinTime is only a description of a data schema and queries to be run, it requires manual implementation.

3.3.5 influxdb-comparisons

The influxdb-comparisons project is a benchmark created by InfluxData, vendor of InfluxDB.

• Representative: The influxdb-comparisons benchmark simulates a DevOps use case, where a lot of different hosts send usage statistics (such as CPU load, disk IO usage, etc.) to a time series database. This is a representative benchmark for this scenario.

• Portable: The benchmark currently supports seven different TSDBs.

• Scalable: Only single-node performance is tested. The benchmark could be extended to perform on multi-node database systems.

• Verifiable: The source code for influxdb-comparisons is available under the MIT licence on GitHub.

• Simple: The benchmark follows a five-stage course, in which each stage performs a single operation or test.

3.4 Evaluation of existing benchmarks

Table 3.1 shows the compiled evaluation of existing benchmarks.

(Benchmarks evaluated: TS-Benchmark, IoTDB-benchmark, TSDBBench, FinTime and influxdb-comparisons, against the properties representative, relevant, portable, scalable, verifiable and simple. TS-Benchmark is representative for IoT use cases only, FinTime for financial use cases only, and influxdb-comparisons for DevOps use cases only; the individual assessments are given in Section 3.3.)

Table 3.1: Evaluation of existing TSDB benchmarks

3.4.1 On scalability

Scalability is a gap in the current state of the art. Only one benchmark, TSDBBench, tests multi-node performance. Testing multi-node set-ups is often harder due to either long manual or error-prone automated test set-up provisioning.

When TSDBs are actually deployed in the real world, multi-node setups are the norm. Benchmarks should reflect this. Actually supporting multi-node setups in a benchmark is usually not hard, but configuring, setting up, and comparing these setups takes a lot of time.

Most benchmarks are able to test multi-node setups, because most distributed TSDBs present a single interface: the client application does not need to be aware of the clustered nature of the TSDB.

3.4.2 On representativeness

As mentioned in Section 3.2, representativeness means that benchmarks must simulate real-world conditions, both the input to the system and the system itself. For the system itself, this means no configuration tuning that would not be used in real production systems, running benchmarks on system configurations that reflect systems on which production databases would run, etc. For the input to the system, it means real world data and real world queries, or data and queries comparable to real world usage. Representativeness is important for generalisation purposes: we cannot generalize the results of a benchmark to real world usage if the benchmark is not representative of real world usage.

TS-Benchmark, FinTime and influxdb-comparisons seem to be representative benchmarks, but this is only true for specific domains. The results of FinTime are only valid in financial contexts, and those of influxdb-comparisons only in specific DevOps contexts. This leads to false generalisations: we cannot draw conclusions about the performance of a database as a whole when a benchmark simulating a single use case is used.

Tay [20] and Zhang et al. [21] have made the case for application-specific benchmarking: instead of using generic micro-benchmarks, real world data are either used directly to benchmark a system or used to construct a representative benchmark.

Since the use cases of time series databases are broad, it is necessary to develop benchmarks that test a variety of representative scenarios. At the moment, no such benchmarks exist.

3.5 Contribution

This dissertation discusses the design, technical implementation and results of a representative benchmark. It compares three representative workloads to a baseline. The representative workloads use existing real world time series data sets and are chosen to simulate environments and use cases in which TSDBs are often used.

Evaluation of the results of the benchmark will determine if representative benchmarks are a necessity, or if non-representative benchmarks accurately predict performance for representative workloads. If non-representative benchmarks can predict real world performance, then representative workloads are not needed, which may lead to simpler benchmarks. If non-representative benchmarks cannot accurately predict real world performance, the validity of non-representative benchmarks can be called into question.

Chapter 4

A new benchmark

In the previous chapter, current benchmarks have been examined, and their insufficient representativeness has been noted. This may present a problem for generalisation of their results: do they accurately model real world performance? In this chapter, a new benchmark will be described, with a focus on representativeness. This benchmark will be used to test both representative and non-representative workloads to examine differences in performance.

4.1 Benchmark components

A benchmark consists of multiple separate components. The workload data set characteristics are the time series data characteristics described in Section 3.1.4. Workload query characteristics comprise the characteristics of the queries themselves and the spread between query types. Finally, the metrics measurement component will be considered.

4.1.1 Workload data set characteristics

Apart from the time series data characteristics discussed in Section 3.1.4, time series data sets can be categorised as synthetic or real world, and as having a high or low existing data volume.

Synthetic workload data sets are generated by tunable synthesizers [22]. These workloads may trade representativeness for configurability, and care should be taken in their configuration. Real world data will be used as the workload for this benchmark.

High existing data volumes may influence database performance. For big data sets, a DBMS may need to scan large amounts of data.

4.1.2 Workload query characteristics

In Section 3.1.2, the functions of TSDBs were defined. These lead to possible queries, such as reading single data points, averaging data point values within a time range, and summing all data point values with a certain tag.

Not only the type of queries is important, but also the relative frequency of each query type compared to all query types. For example, an application may frequently insert new data, while calculating the maximum data point value is done infrequently. A sketch of such a query mix is shown below.
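One possible way to encode such a query mix is sketched below; the query types follow the capabilities listed in Section 3.1.2, but the weights are purely illustrative and not taken from any existing benchmark.

```python
import random

# Relative frequencies of query types in a hypothetical workload query set.
QUERY_MIX = {
    "INS": 0.90,   # inserts dominate in many TSDB applications
    "SCAN": 0.06,
    "AVG": 0.03,
    "MAX": 0.01,
}

def next_query_type(mix=QUERY_MIX):
    """Draw the next query type according to its relative frequency."""
    types, weights = zip(*mix.items())
    return random.choices(types, weights=weights, k=1)[0]

print([next_query_type() for _ in range(10)])
```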

Concurrency may play an important role when benchmarking queries. When mul- tiple queries are run, performance may degrade, especially when read and write queries are mixed. In this dissertation, mixed read and write queries are not con- sidered. Write queries will be considered in an ingestion benchmark, and read queries will be considered in load testing and latency testing benchmarks.

4.1.3 Measurement characteristics

The last benchmark component is the measurement component. This component measures the effective performance of the operations performed. The metrics surveyed may be latencies, network usage, storage requirements, etc. Care must be taken that the measurement component minimally influences the benchmark results. For example, an ingestion client could monitor the number of data points per second sent to the database: this requires no instrumentation on the database server and thus minimally disturbs it.
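A minimal sketch of such a client-side measurement is shown below; send_batch stands in for whatever function writes one batch of data points to the TSDB under test and is a hypothetical name.

```python
import time

def measure_ingestion_rate(batches, send_batch):
    """Measure ingestion throughput (data points per second) on the client only.

    No instrumentation is added on the database server itself; the client simply
    counts the points it has sent and divides by the elapsed wall-clock time.
    """
    points_sent = 0
    start = time.perf_counter()
    for batch in batches:
        send_batch(batch)          # hypothetical: writes one batch to the TSDB
        points_sent += len(batch)
    elapsed = time.perf_counter() - start
    return points_sent / elapsed
```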

4.2 Design of a representative data workload

In Section 3.4.2, it was argued that representativeness is dependent on industry and use cases. Therefore, as the workload data set for the time series database benchmark, four different data sets will be considered. These are selected to be in different domains, with different time series characteristics. To ensure representativeness, data sets with real data are used. These are selected to model real world use cases for time series databases.

Of course, four different data sets do not cover every industry or use case. However, analysis of the results of benchmarks using these workload data sets will allow comparisons that indicate if the considered use case has an influence on performance.

4.2.1 A baseline workload

This is a non-representative workload, to be used as a baseline for comparison with representative workloads. Data points are written to one metric with random values and random tags.

• Metrics: Only one metric is tracked: “benchmark”. All data points belong to this metric.

• Regularity: The time series is fully regular, with one data point being produced every second.

• Volume: Low volume. There is only one metric where a data point is produced every second. There are no spikes of traffic.

• Data type: For this benchmark, floating point numbers will be used to represent the data point values.

• Tags: Every data point is tagged with two random tags. The possible values of the first tag are TAG_1_00 to TAG_1_99 and the possible values of the second tag are TAG_2_00 to TAG_2_99.

• Tag value cardinality: High. There are 10,000 (two tags with 100 possible values each) possible tag combinations.

• Variation: High. The values are randomly generated for every data point. Data point values bear no relationship to previous values. The values are floating point numbers between 0 and 100 inclusive.
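A small sketch of a generator for this baseline workload is given below; it is illustrative only and not the benchmark's actual implementation.

```python
import random
import time

def baseline_points(start_ts=None):
    """Yield the baseline workload: one point per second for the metric
    "benchmark", a random float in [0, 100], and two random tags."""
    ts = int(time.time()) if start_ts is None else start_ts
    while True:
        yield {
            "metric": "benchmark",
            "timestamp": ts,
            "value": random.uniform(0, 100),
            "tags": {
                "tag1": f"TAG_1_{random.randint(0, 99):02d}",
                "tag2": f"TAG_2_{random.randint(0, 99):02d}",
            },
        }
        ts += 1  # fully regular: one data point every second

# Example: inspect the first three generated points.
gen = baseline_points(start_ts=1_546_300_800)
for _ in range(3):
    print(next(gen))
```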

4.2.2 A financial time series workload

Time series data are often used in financial analysis. Prices of commodities, fu- tures, assets, and other financial instruments produce time series [23]. This his- torical data can then be used in performance calculations, price prediction, and financial ratio calculation.

The data set used for this benchmark was created by Boris Marjanovic and published on Kaggle [24]. It is licensed under CC0 (Creative Commons 1.0 Public Domain Dedication, which dedicates the work to the public domain). The data set contains historical data for 1344 Exchange-Traded Funds (ETFs) and 7195 stocks. For each stock and ETF, it lists the open, high, low and closing prices, next to the volume (the total number of shares traded during a day) and the open interest (the number of outstanding contracts that have not been fulfilled) for every day the ETF or stock was trading.

• Metrics: Six different metrics are tracked: the opening, high, low and closing prices for the stock, and the volume and open interest for the stock.

• Regularity: Semi-regular. Every day, an update is published, except on weekends and market closings (such as holidays). It is rare for new stocks to be published or for existing stocks to be removed from the exchange.

• Volume: Low volume, with short bursts. Data are published at market closing, which is the same time every day. This may lead to short spikes of high traffic when a lot of stocks are tracked.

• Data type: Prices are represented by numbers with five digits past the decimal point. Floating point numbers are sometimes avoided for storing such prices, due to possible inaccuracies and the higher cost of floating point operations; instead, the prices are multiplied by 10^5 and saved as integers. Since this places a burden on client applications if the database does not perform the conversion itself, prices will be stored as floating point numbers for this benchmark.

• Tags: Only a single tag is saved: the ticker symbol. Ticker symbols are strings for which no general format is specified: every exchange specifies its own rules. In general, symbols are short (nine characters is the maximum length in the data set), alphanumeric (and, in this data set, contain no digits) and case-insensitive. As an example, Apple's stock ticker symbol is AAPL.

• Tag value cardinality: Medium. There are 7,164 possible tag values. For the first one million data points, the tag cardinality is 143.

• Variation: Low. While stock prices are volatile, it is rare for stocks to show large changes in the span of a day.

4.2.3 A rating system workload

Rating systems allow customers and consumers to rate their experiences of goods and services. Users can like or dislike products, leave comments about a restaurant visit, or leave a rating for sellers on online marketplaces. Commonly, this feedback is represented as a five-star system, where half a star represents the lowest score, and five stars represents the maximum score.

GroupLens Research created data sets of varying sizes from the MovieLens website, which allows users to rate movies with a five-star system [25]. The MovieLens 20M data set contains twenty million ratings and is the basis for this workload. The data set comes with a custom license, allowing non-commercial use, but forbidding redistribution.

• Metrics: Only one metric is tracked: ratings. The value of the data point is the rating the user gave a movie, and the timestamp is when this rating was published.

• Regularity: The time series is irregular. The data points are events, produced when a user leaves a review.

• Volume: Approximately one review was left every thirty seconds. This is not a high level of activity; we can therefore qualify this time series as low volume.

• Data type: The ratings are floating point numbers, between 0.5 and 5.0 in 0.5 increments. This leads to ten possible values.

• Tags: Five tags are associated with every data point: userId (integer, the identifier of the user who left the review), title (string, the title of the movie being reviewed), imdbId (integer, the identifier of the movie on the Internet Movie Database, https://www.imdb.com/), tmdbId (integer, the identifier of the movie on The Movie Database, https://www.themoviedb.org/), and genres (string, a list of genres the movie belongs to, encoded as a string).

• Tag value cardinality: High. There are 138,493 different users, and 26,212 different movie titles. The rest of the tags are dependent on the movie title (the title directly implies the genre and external identifiers). Since not every user has rated every movie, the tag cardinality is not the product of these two figures. The tag cardinality of the complete data set was determined to be 20,000,262 and the tag cardinality of the first one million data points was determined to be 1,000,000.

• Variation: Subsequent points do not relate to each other, since they are ordered by timestamp and not the movie reviewed. This leads to a high variation. However, the absolute variation is still small, since the maximum absolute variation is 4.5.

4.2.4 An IoT workload

IoT applications, in particular sensor applications, produce a lot of data. This can be temperature data, power consumption, location data, etc. IoT data are almost always temporally indexed, thus a time series database is a natural fit.

The UCI (University of California, Irvine) Machine Learning Repository [10] contains the Individual household electric power consumption data set, which records power information for a house every minute. It was created by Georges Hebrail and Alice Berard and released under the CC BY 4.0 (Creative Commons Attribution 4.0 International) license.

• Metrics: Seven metrics are tracked for the household: active and reactive power, voltage, intensity (current), and three power meters for different rooms.

• Regularity: The data set is regular. Every minute, a new data point is emitted. Data are missing for a small period of time; for these missing data points, the values were filled in with zeroes (a reindexing sketch follows this list).

• Volume: Only seven data points are emitted every minute. This makes the data set low volume.

• Tags: The data contains no tags.

• Variation: Variation between subsequent data point values is low due to the small sampling interval.
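A sketch of the zero-filling step mentioned above is shown below, assuming the power data are held in a pandas DataFrame indexed by timestamp; the actual pre-processing code may differ.

```python
import pandas as pd

def fill_missing_minutes(df):
    """Reindex the household power data onto a regular one-minute grid and
    fill gaps with zeroes, as done for this workload (sketch; column names
    and the exact filling strategy of the original pre-processing may differ)."""
    full_index = pd.date_range(df.index.min(), df.index.max(), freq="1min")
    return df.reindex(full_index, fill_value=0.0)
```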

4.2.5 Workload data set overview

Table 4.1 shows an overview of all used workload data sets.

4.2.6 Data set pre-processing

The data sets were pre-processed using Python. All data sets are denormalized so as to provide one data point per line in the resulting file. Every line provides a complete data point, including the timestamp, metric name, data point value, and (potentially) tags.
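The sketch below illustrates this denormalization for the IoT data set. The input column names and ";" separator follow the UCI file; the output layout (timestamp, metric, value per line) is an assumed intermediate format, not the exact code used for the benchmark.

```python
import csv
from datetime import datetime

METRICS = ["Global_active_power", "Global_reactive_power", "Voltage",
           "Global_intensity", "Sub_metering_1", "Sub_metering_2", "Sub_metering_3"]

def denormalize_iot(in_path, out_path):
    """Turn every row of the UCI household power file into one line per metric:
    timestamp, metric name and value (sketch of the pre-processing step)."""
    with open(in_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin, delimiter=";")
        writer = csv.writer(fout)
        for row in reader:
            ts = datetime.strptime(row["Date"] + " " + row["Time"], "%d/%m/%Y %H:%M:%S")
            for metric in METRICS:
                raw = row[metric]
                value = 0.0 if raw in ("?", "") else float(raw)  # missing values become zero
                writer.writerow([int(ts.timestamp()), metric, value])
```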


                       Baseline      Financial     Rating        IoT
Metrics                1             6             1             7
Regularity             Regular       Semi-regular  Irregular     Regular
Volume                 Low volume    Low volume    Low volume    Low volume
Tags                   2             1             5             0
Tag value cardinality  10,000        7,164         20,000,262    0
Variation              High          Low           High          Low
Total data points      20,000,000    74,418,459    20,000,262    14,526,812
License                N/A           CC0           Custom        CC BY 4.0

Table 4.1: Overview of workload data sets

4.3 Design of a representative query workload

A representative data set is only part of the workload. Representative queries on these data sets are the other. While real world data sets are readily available, information on data usage or queries performed on these data sets is not. Therefore, for every data set, logical queries and patterns will be created. For a truly representative query workload, existing TSDB systems should be surveyed and their usage patterns monitored.

The implementation of query workloads was complicated by the fact that every database uses a custom query language. These languages may have different semantics. For example, when grouping a time range by week, some TSDBs will start the grouping block on the start timestamp of the given range, while others will align the groups by the calendar (so the first block may not be a full week). Query results have to be compared to ensure correctness. A standardized query language, as SQL is for RDBMSs, would speed up development of benchmarks and TSDB applications.
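The difference in grouping semantics can be made concrete with a small sketch: for the same timestamp and query range, one convention anchors weekly buckets on the start of the requested range, the other on fixed calendar/epoch boundaries, so the resulting groups differ. The values used here are arbitrary and only illustrative.

```python
WEEK = 7 * 24 * 3600  # seconds

def bucket_aligned_to_range(ts, range_start):
    """Bucket start when groups are anchored on the start of the queried range."""
    return range_start + ((ts - range_start) // WEEK) * WEEK

def bucket_aligned_to_calendar(ts):
    """Bucket start when groups are anchored on fixed calendar/epoch boundaries."""
    return (ts // WEEK) * WEEK

# The same timestamp falls into differently delimited buckets:
range_start = 1_000_000            # an arbitrary range start, not a week boundary
ts = range_start + 3 * 24 * 3600   # three days into the range
print(bucket_aligned_to_range(ts, range_start))  # 1000000
print(bucket_aligned_to_calendar(ts))            # 1209600
```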

4.3.1 Queries for the baseline workload

The baseline workload is a non-representative workload to which others will be compared. The query workload reflects this: there is only one query, requesting a single data point between two timestamps with two specific tags.

4.3.2 Queries for the financial workload

The financial query workload simulates a stock information application which informs stock traders of historical statistics. The following queries are run:

• Get all opening prices for a stock in a time range (relative frequency: 0.20)

• Get the minimum closing price for a stock (relative frequency: 0.25)

• Get the maximum opening price for a stock (relative frequency: 0.15)

• Get the mean high price for a stock grouped by week (relative frequency: 0.25)

• Get the total volume for a stock grouped by four weeks (relative frequency: 0.15)

4.3.3 Queries for the rating workload

The ratings query workload simulates the backing database of a movie website. The queries get the average rating for a movie by title or IMDb identifier, get ratings for a particular user, and group average ratings for a movie by year:

• Get the mean rating for a movie with a specific title (relative frequency: 0.70)

• Get the mean rating for a movie with a specific IMDb identifier (relative frequency: 0.10)

• Get all ratings by a specific user (relative frequency: 0.05)

• Get mean rating per year for a movie with a specific title (relative frequency: 0.15)

4.3.4 Queries for the IoT workload

The IoT query workload mimics a power consumption application. Mean active power is requested for one and two week time ranges (the latter grouped by day), and for a twelve week time range grouped by week:

• Get mean active power for a one week time range (relative frequency: 0.4)

• Get mean active power for a two week time range grouped by day (relative frequency: 0.4)

• Get mean active power for a twelve week time range grouped by week (rela- tive frequency: 0.2)

4.4 Metrics

Ingestion throughput is the number of data points per second that can be inserted into the database, possibly using a bulk loading mechanism. This metric is especially important for OLAP applications, where data from a master database is loaded into a TSDB for time series analytics processing.

Space consumption is the amount of storage required to store the database. Storage efficiency is space consumption divided by the number of data points stored. This metric shows how efficient the database engine is at compressing data points. The measurement is taken after loading the database with a predefined set of data points, and is expressed in bytes per data point.
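Storage efficiency can be computed with a sketch like the following, which sums the size of every file under the database's data directory after shutdown and divides by the number of data points loaded; the directory path is a parameter and the example path is only illustrative.

```python
import os

def storage_efficiency(data_dir, n_points):
    """Bytes per data point: total on-disk size of the data directory
    (raw database files plus write-ahead logs) divided by the number of
    data points loaded."""
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total / n_points

# e.g. storage_efficiency("/var/lib/influxdb", 1_000_000)
```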

Latency, expressed in mean, 95th, and 99th percentile response times, shows how fast the database can answer queries. For user-facing applications, this is especially important: applications need to render quickly, or users leave.

Load testing gives us the maximum number of requests per second a TSDB can handle.

The mean response size is the average size in bytes of the returned TSDB response body. This response body may contain, next to the requested data, metadata such as the number of data points used in calculation, aggregated tags, etc. While this information may be useful to some applications, a small TSDB response is generally preferred. This leads to faster responses, lower network load and lower memory requirements, though the effects may be small.

4.5 Technical implementation

4.5.1 Test environment

The tests were run on homogeneous machines containing two quad-core Intel E5520 (2.2GHz) CPUs, 12GB RAM and a 160GB hard disk. The devices were connected via Gigabit Ethernet.

The versions of the databases used are as follows: OpenTSDB 2.3.1 (with HBase 1.4.4), InfluxDB 1.5.4, and KairosDB 1.2.1 with either ScyllaDB 3.0.6 or Cassandra 3.11. These databases were minimally changed from their stock configuration. For InfluxDB, the maximum number of series was increased; for OpenTSDB, chunked requests were enabled; and for KairosDB (with ScyllaDB) the maximum batch size was decreased to one hundred for the financial workload.

The databases were run in Docker containers (one container for OpenTSDB and InfluxDB, two containers for KairosDB with either underlying DBMS in a docker-compose setup). When not under test, containers were stopped. Only one container was under test at a given time and no other applications were active on the database host, apart from basic monitoring software.

During tests, one machine acted as the database host, while the other loaded the data or performed queries.

4.5.2 Data ingestion

Data loaders from the influxdb-comparisons project [26] were used. These load the data sets, converted for use with a specific database, into that specific database. Since no data loader was available for KairosDB, its Telnet API was used.

4.5.3 Load and latency testing

Vegeta [27], a load testing tool, is used to test latencies. Every second, a data set-specific number of requests is made to the TSDB. There are twenty queries in every query workload, and each one is translated to the query language of every TSDB. The queries are cycled in a round-robin pattern, so as to ensure determinism.

http_load [28] is used to conduct load testing. The program is configured with a thirty second timeout, ten parallel requests and a thirty second run time. The same URLs as in the latency testing measurement are loaded; when one request finishes, another one starts. Afterwards, the number of requests per second is reported.
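A sketch of how a twenty-entry target list could be derived from the relative query frequencies and written in the plain-text target format Vegeta accepts (one "GET <url>" per line) is shown below; the host, port and query strings are placeholders, not the actual benchmark queries. Such a file could then be passed to, for example, vegeta attack -rate=30 -duration=30s -targets=targets.txt.

```python
def build_targets(queries, total=20):
    """Expand (url, relative_frequency) pairs into a fixed-size list of targets.

    With total=20, a relative frequency of 0.25 becomes 5 occurrences of that
    query; the list is then cycled round-robin by the load/latency testers.
    """
    targets = []
    for url, freq in queries:
        targets.extend([url] * round(freq * total))
    return targets

# Hypothetical example for the financial workload against an InfluxDB host:
financial = [
    ("http://db-host:8086/query?db=benchmark&q=...", 0.20),
    ("http://db-host:8086/query?db=benchmark&q=...", 0.25),
    ("http://db-host:8086/query?db=benchmark&q=...", 0.15),
    ("http://db-host:8086/query?db=benchmark&q=...", 0.25),
    ("http://db-host:8086/query?db=benchmark&q=...", 0.15),
]
with open("targets.txt", "w") as f:
    for url in build_targets(financial):
        f.write(f"GET {url}\n")
```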

4.6 Design evaluation

In Section 3.2, properties of a good benchmark were discussed. Now, these will be applied to the benchmark described in this chapter.

• Representative: Through the use of multiple use cases, real world, non-synthetic data sets, and balanced query workloads, this is a very representative benchmark.

• Relevant: This benchmark evaluates TSDBs. As the fastest growing type of database [19], this can be considered a relevant benchmark. The metrics measured are based on other database and TSDB benchmarks, and are comparable with them.

• Portable: To add a new database to the benchmark, the following components are necessary: a Docker container containing the database, a data formatter, a data ingestion loader, and a set of queries as HTTP requests. Most open source databases have existing Docker containers, and creating a data formatter is a few hours of work. A data ingestion loader is more time-consuming, but many databases have existing ingestion loaders. The last component presents a challenge: not every database has an HTTP interface. For example, TSDBs that rely on an existing RDBMS, such as Timescale (built on PostgreSQL), do not include an HTTP API. To benchmark this kind of database, the benchmark would need to be extended to include other measurement tools. This does make comparison of results harder.

• Scalable: Both the ingestion and querying component of the benchmark are able to accept a list of different URLs to spread the load. This makes the ingestion and measuring component of the benchmark scalable. However, tests were only conducted on single-node TSDB setups. Multi-node database setups are hard to set up right, and it is especially hard to fairly compare heterogeneous DBMSs, such as TSDBs.

• Verifiable: The data sets used are available under open licenses (Section 4.2.5), the tools to ingest are available under the MIT licence [26], and the tool to test latencies and response size is available under the MIT license. The components to denormalize the data sets, to transform them to specific database formats, and the database setup components will be made available as open source when the embargo on this master’s dissertation ends.

• Simple: The benchmark was kept as simple as possible, with distinct parts doing a single thing. This leads to an architecture where one component can easily be swapped with another, e.g. the data generator could be switched with a generator from another benchmark.

Chapter 5

Results

The results of the data ingestion of the data set workloads described in Section 4.2 and the query workload upon those data sets described in Section 4.3 are presented here. The metrics reported are described in Section 4.4. The results are analysed to examine possible performance differences between non-representative and representative workloads.

5.1 Storage efficiency

One million data points were inserted in TSDBs. Afterwards, the database was shut down, and the size of the data directory of every TSDB was measured. This includes raw database files and write-ahead logs. As a comparison, the storage efficiency of CSV files is included. Figure 5.1 shows the results graphically.

InfluxDB performs nearly as well as CSV for the baseline data set workload, but is much less efficient for the representative data set workloads.

OpenTSDB and KairosDB require at least one tag to be present on the data points. Therefore, the tag notags with the string value "true" was added on the IoT data set, which contains no tags otherwise. This may influence storage consumption, but both TSDBs perform very well for this data set workload nonetheless.

Figure 5.1: Storage efficiency of different TSDBs in bytes per data point. Data points contain a timestamp, a value, and may contain tags, depending on the data set.

Figure 5.2: Relative storage efficiency of different TSDBs per data point compared to the CSV source format.

OpenTSDB outperforms every other TSDB for all representative data set workloads. It shows exceptional performance for the IoT data set, where it is able to store data nearly four times as efficiently as the CSV input data set. This is likely a result of the low tag value cardinality: there is only one tag and one tag value¹.

KairosDB (with Cassandra) performs well for the IoT data set workload, but does not do better than the CSV source data set. It always uses at least twice as much storage space as the source data set.

KairosDB (with ScyllaDB) was unable to complete data loading for the ratings data set. For the other data sets, it used exactly 1074.73 bytes per data point to store all data, regardless of the data size. To ensure these three measurements were correct, they were repeated, and the same values were found. The fact that the persisted data size is so large is remarkable, since ScyllaDB uses the same storage format as Cassandra [29].

When comparing relative storage efficiency to CSV (graphically displayed in Figure 5.2), the impact of high tag value cardinality becomes clear. Tag value cardinality is the number of possible combinations tag values can make. InfluxDB in particular requires relatively more storage space as tag value cardinality grows for the representative data set workloads. The other TSDBs display no such dependency on tag value cardinality. Variation may also play an important role. The representative workloads have lower data point value variation than the baseline, especially the IoT data set. This may enable OpenTSDB to store the time series more efficiently.

It is clear that representative data set workloads reveal patterns not uncovered by a traditional data set workload. A non-representative benchmark might declare InfluxDB the winner of a storage efficiency test, while it is clear that, on the given representative domains, OpenTSDB has much better storage efficiency.

¹The original data set does not contain any tags, but since OpenTSDB requires at least one tag for data points, the tag notags with the string value "true" was used.

5.2 Data ingestion throughput

The data ingestion throughput or data ingestion rate is the number of data points a TSDB can ingest per second in a bulk loading pattern. Ingestion rate tests were performed with data sets with one million data points and the results are shown in Figure 5.3.

InfluxDB outperforms the other TSDBs in all data set workload ingestion tests except the ratings workload, where its ingestion is seven times slower than KairosDB (with Cassandra) and nearly five times slower than OpenTSDB. This is likely due to the high series cardinality.

OpenTSDB performs better than KairosDB, but is still significantly slower than InfluxDB. For the non-representative baseline data set workload, OpenTSDB is nearly five times slower than InfluxDB. For representative data set workloads, this gap shrinks. InfluxDB is just over twice as fast as OpenTSDB for the IoT and financial data set workloads. For the ratings data set workload ingestion test, OpenTSDB is nearly five times as fast as InfluxDB.

KairosDB (with ScyllaDB) was unable to complete ingestion of the ratings data set workload. The ingestion speed was 33,340 data points per second, but since not all data points were successfully saved, this result is excluded.

The differences between KairosDB with Cassandra and KairosDB with ScyllaDB are not huge, but ScyllaDB consistently outperforms Cassandra. For the baseline data set workload, ScyllaDB performs 8.10% better, and for the IoT and financial workloads 12.95% and 5.55% respectively.

For the IoT and financial data set workloads, relative performance is comparable to the baseline. InfluxDB comes in first, OpenTSDB second, followed by KairosDB with ScyllaDB and KairosDB with Cassandra, respectively. However, for the ratings data set workload, we see a different pattern. Here, InfluxDB has the slowest ingestion speed, and KairosDB with Cassandra the highest, with OpenTSDB in the middle. The reason for this is unclear. High tag value cardinality has been known to slow down InfluxDB performance through high memory usage, but InfluxDB performed well on the baseline, which also has high tag value cardinality. This performance may be caused by the large size of the data points and the large number of tags.

The use of real world, representative data sets revealed a performance degradation of InfluxDB compared to the other TSDBs for the ratings data set.

5.3 Load testing with query workload

The maximum number of queries per second was determined for every TSDB-data set tuple. The results are shown in Figure 5.4.

InfluxDB significantly outperforms all other TSDBs for every query workload. In the non-representative query workload, it outperforms the next runner-up (OpenTSDB) by a factor of 18. In the representative query workloads, this factor is different. For the IoT query workload, InfluxDB performs 8.5 times better than OpenTSDB, for the financial query workload 15 times better, and for the ratings query workload nearly 37 times better. Clearly, the query workload has a big impact on performance.

KairosDB with Cassandra was not able to complete the ratings workload due to memory constraints. KairosDB with ScyllaDB was not able to complete this query workload because not all the data could be loaded (see Section 5.2). For the other query workloads, ScyllaDB outperforms Cassandra every time. In the baseline query workload, it outperforms by just over 20%. For the representative IoT query workload, it achieves 36.49% more requests per second, and for the financial query workload a 22% improvement.

It is remarkable how much better KairosDB performs on the representative workloads. The baseline workload requests just one data point, a very simple query which is easily cacheable, and yet KairosDB performs about twice as well on the more representative, but much more complex, IoT and financial benchmarks. There is no clear explanation why KairosDB would perform so much worse for a much simpler workload. If anything, it would be expected to perform a lot faster than on the IoT workload, since the baseline only requests a single data point (which can be cached) and requires no aggregation or calculations.

Figure 5.3: Data points ingested per second. Data sets used were one million data points each.

OpenTSDB performance is good in the baseline and the IoT query workload, but is degraded in the financial and ratings query workload. This may have to do with the fact that the data ranges to scan are much bigger in these last two query workloads, while the first two only require data from relatively narrow time ranges.

5.4 Response latency

The mean latency, shown graphically in Figure 5.5, is the mean time it takes to receive a response from the TSDB. The 95th percentile response time is displayed graphically in Figure 5.6. This metric shows the maximum latency for 95% of requests; one in twenty requests will have a longer latency than this.

Figure 5.4: Maximum requests per second. Tests were performed on data sets one million data points in size.

The tests were performed with a constant rate of requests. This rate was determined by choosing the lowest maximum requests per second for every query workload: empirically, the request rate was increased until timeouts were observed, and this rate was then rounded down. It was found that some TSDBs were able to handle more requests per second than the load testing indicated when the number of parallel requests was increased, with little increase in latency. Ultimately, the rates used were rounded to 10 requests per second for the baseline query workload, 20 for the IoT query workload, 30 for the financial query workload, and 2 for the ratings query workload.

KairosDB with Cassandra and KairosDB with ScyllaDB were not able to complete the ratings workload, due to memory constraints and because not all data could be loaded, respectively (see Section 5.3).

InfluxDB is the clear winner when it comes to latency. The TSDB is able to handle requests and send a response in less than 2ms for the baseline, and queries for the complex ratings query workload take on average just over 100ms. InfluxDB outperforms all other TSDBs tested when it comes to latency, both mean latency and 95th percentile.

OpenTSDB shows good performance for the baseline and IoT query workloads, but, as in the load testing, has trouble with the financial and ratings query workloads. As mentioned in Section 5.3, this may have to do with the large time ranges the TSDB has to scan to aggregate data points. The latencies for the last two workloads are high: the average latency is over two and a half seconds.

KairosDB with ScyllaDB shows greater performance than KairosDB with Cassandra for every query workload. For the first two workloads, it performs nearly twice as fast when comparing mean latency. For the financial workload, the difference (ScyllaDB 4.63% faster) is small.

5.5 Mean response size

In Figure 5.7, the mean response size is shown graphically. This mean is clearly coupled to the data set. Overall, InfluxDB has the most verbose responses. After inspecting a few responses, the main reason seems to be that InfluxDB encodes timestamps as strings in responses, while KairosDB uses numbers, and OpenTSDB uses numbers encoded as strings. Compare these encodings:

• KairosDB encodes the time as 1189641600000, representing the number of milliseconds since January 1, 1970. This takes 14 bytes to encode in JSON.

• InfluxDB encodes the time as "2007-09-13T00:00:00Z", which takes 23 bytes to encode. However, this format is able to add more precision, adding seconds, milliseconds and even nanoseconds.

• OpenTSDB encodes the time as "1189641600", representing the number of seconds since January 1, 1970, as a string. This takes 13 bytes to encode in JSON, but is not as precise as the other encodings.

Figure 5.5: Mean latency per request.

Figure 5.6: 95th percentile of latency per request.

Other factors influence the response size. For example, OpenTSDB and KairosDB will return a list of tags used on data points. For large responses, such as those for the representative query workloads, the timestamp encoding is the deciding factor.

Both KairosDB TSDBs experienced timeouts for the baseline data set. For KairosDB on Cassandra, 27 timeouts were encountered, and KairosDB on ScyllaDB encountered 4 timeouts. These were ignored when calculating the mean response size.

KairosDB on Cassandra and on ScyllaDB both return the same number of bytes for the IoT and financial workloads since the underlying databases are interchangeable. Given the same data, KairosDB should deliver the same response, and this result gives confidence that it does².

5.6 Evaluation

When comparing storage efficiency (Section 5.1), representative data sets showed that storage efficiency varies heavily between use cases, and so does relative storage efficiency. The results of the non-representative benchmark cannot be generalised to relative storage efficiency in representative workloads. Tag value cardinality and data point value variation were identified as possible parameters with a high impact on storage efficiency. Real world data usually has low variation, while non-representative benchmarks often use random values (high variation). These non-representative benchmarks may become more representative of real world use cases through the use of random walks instead, which have lower variation and more closely model real world data.
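A random walk of the kind suggested here could be generated with a sketch like the one below: each value is the previous value plus a small random step, which keeps the variation between subsequent points low while still producing non-trivial data; the step size and bounds are arbitrary illustrative choices.

```python
import random

def random_walk(n, start=50.0, step=0.5, low=0.0, high=100.0):
    """Yield n values where each differs from the previous one by at most
    `step`, clamped to [low, high]; variation between subsequent points
    stays low, unlike fully random values."""
    value = start
    for _ in range(n):
        value = min(high, max(low, value + random.uniform(-step, step)))
        yield value
```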

The use of representative data sets and query workloads for ingestion speed testing (Section 5.2) showed performance problems when ingesting the complex ratings data set, especially for InfluxDB.

²Some individual queries were compared to further confirm that KairosDB with either Cassandra or ScyllaDB gives the same response for the same query.

Figure 5.7: Mean size in bytes of the TSDB response.

In the load testing benchmark (Section 5.3), it was discovered that OpenTSDB performed well for the baseline and IoT query workloads, but not for the financial and ratings query workloads.

For the response latency (Section 5.4), the use of representative benchmarks again showed a performance degradation for OpenTSDB for the financial and ratings query workloads, which use broad time ranges. Otherwise, the baseline is a good predictor for relative performance in the representative benchmarks.

When testing the mean response size (Section 5.5), the encoding of timestamps was shown to be the deciding factor when it comes to query workloads which return a large response.

These results make it clear that representative data set workloads and query workloads may lead to important differences in benchmark results. They shed doubt on the real world applicability of benchmarks using random or synthetic data sets and/or non-representative query workloads.

The fact that not all representative workloads show performance impact (e.g. only the ratings workload showed the performance degradation for InfluxDB in the data ingestion test) highlights the importance of using multiple representative workloads - just one representative workload may not be enough to highlight possible deviations or performance degradations. It is impractical to create a workload for every use case, but it is possible to generalize workloads into categories (e.g. volume, tag value cardinality, data type, ...). Further testing is needed to confirm that data sets with the same workload parameters will yield comparable results.

Chapter 6

Conclusions and future work

6.1 Conclusions

Compared to a baseline non-representative workload, representative workloads showed significant performance differences when it came to storage efficiency, data ingestion speed for complex data, latency and maximum request rate (when broad time ranges are used). Storage efficiency (bytes per data point) is lower for data sets with low tag value cardinality and low variation. Non-representative benchmarks using random data will have high variation, while real world data often displays low variation. Using random walks instead of random values may make a benchmark more representative. Data ingestion throughput testing highlighted performance problems for data sets with large data points and high tag cardinality. Latency and load testing showed that some databases perform significantly worse when they need to scan a large amount of data. This illustrates the importance of using representative workloads.

A number of TSDB benchmarks have been studied, but none of them use representative workloads. Three existing TSDB benchmarks use nearly representative workloads, but none of them use real world data sets. Instead, they use random or synthesized data. Considering that the benchmark presented in this dissertation, which uses representative, real world workloads, sheds a different light on TSDB performance, the relevance of these existing benchmarks may be called into question.

While representative workloads uncovered significant performance differences compared to non-representative workloads, it is impractical to create or test representative workloads for every use case imaginable. TSDB workloads can, however, be categorized with workload parameters (number of metrics, regularity, volume, data type, number of tags, tag value data type, tag value cardinality, variation). Further research is needed to determine if these parameters are enough to accurately describe a TSDB workload and thus generalize results of one workload to another with the same workload parameters.

Benchmarking TSDBs is a complex endeavour due to the absence of standardized query languages, data models, or capabilities (such as aggregators or functions). The proliferation of TSDB models has the advantage of specialisation: instead of optimizing for the general case, individual TSDBs may seek to specialise in a niche, e.g. geo-spatial data querying, nanosecond timestamp resolution, or real-time streaming queries. The disadvantage is that it is much harder to compare different TSDBs. The varying support for operations means that not all TSDBs can be compared to each other, semantic differences in query languages require careful comparison of results to ensure they are valid, and different database interfacing methods may lead to more difficult interpretation of benchmark results.

6.2 Future work

This dissertation has demonstrated the relevance of representative benchmarks. The experiments and tests that were run for this dissertation took a lot of time to prepare and execute, and therefore, a number of extensions have been left for the future. Several possible lines of research could be pursued:

• The hypothesis that workloads with the same data set characteristics yield comparable benchmark results could be tested. Analysis might produce another, non-obvious workload parameter.

• The benchmark described in this dissertation can be extended to use more TSDBs. Currently, four TSDBs are tested, but more can be added. Another approach would be to extend another existing TSDB benchmark to be more representative.

• The query workload could be extended to include data mutations (such as create, update and delete queries). Benchmarks using this query workload might produce even more representative results. However, query spread should be carefully studied: for most query workloads, create queries will heavily outnumber update and delete queries.

• A comparison of TSDB query languages might yield interesting results on their construction and capabilities. Perhaps a unifying query language could be constructed, which would facilitate research into different TSDB families.

• In production environments, TSDBs are often used in multi-node setups. This scalability aspect is only addressed in one existing benchmark. The benchmark in this dissertation could be extended to test clustered TSDBs.

• This dissertation has focused on TSDBs, a specialized type of database. Representative benchmarking could be studied in different domains as well, such as relational databases and specialized non-relational databases (such as graph, triple or document stores).

Appendix A

Detailed results

This appendix lists detailed results discussed and displayed graphically in Chapter 5.

A.1 Data ingestion throughput

Table A.1 lists the detailed results for Section 5.2.

Data set     InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     481818      89360       54792                   59231
IoT          317999      162473      87413                   98736
Financial    156498      86578      78198                    82535
Ratings      4342        21196       29913                   NA

Table A.1: Data ingestion speed in points per second.

A.2 Storage efficiency

Table A.2 lists the detailed results for Section 5.1.

Data set     CSV    InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     1.0    1.4443      2.4213      2.6983                  18.7948
IoT          1.0    3.3308      0.2585      2.2353                  31.1704
Financial    1.0    5.8211      1.5668      6.2073                  36.5115
Ratings      1.0    7.2624      0.891       3.2897                  NA

Table A.2: Storage efficiency per data point, relative to the CSV source format.

A.3 Load testing

Table A.3 lists the detailed results for Section 5.3. Tests were performed using ten requests in parallel, with a thirty second timeout.

Data set     InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     6400.36     347.367     12.8                    15.4667
IoT          997.567     117.3       29.4333                 40.1667
Financial    235.733     15.5664     26.8332                 28.5667
Ratings      78.3333     2.13333     NA                      NA

Table A.3: Maximum requests per second performed using representative queries.

A.4 Response latency

Table A.4 shows the mean latency and Table A.5 the 95th percentile latency for TSDB responses. Table A.6 shows the number of timeouts that occurred during the latency and response size tests. These results are discussed in Section 5.4.

A.5 Mean response size

Table A.7 lists the detailed results for Section 5.5.

Data set     InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     1.266       12.559      862.643                 230.125
IoT          7.636       18.293      155.91                  74.49
Financial    57.88       2681.441    106.85                  122.25
Ratings      104.41      2563.02     NA                      NA

Table A.4: TSDB mean request latency in milliseconds for representative queries.

Data set     InfluxDB      OpenTSDB       KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     1.399121      12.8529        124.640305              70.532287
IoT          22.049411     44.926242      333.99535               193.673539
Financial    87.277173     3991.057059    133.174898              127.224758
Ratings      462.786983    2786.607644    NA                      NA

Table A.5: TSDB 95th percentile request latency in milliseconds for representative queries.

Data set     InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     0           0           27                      4
IoT          0           0           0                       0
Financial    0           0           0                       0
Ratings      0           0           NA                      NA

Table A.6: Number of timeouts during the latency and response size tests.

Data set     InfluxDB    OpenTSDB    KairosDB (Cassandra)    KairosDB (ScyllaDB)
Baseline     185.0       126.0       202.0                   202.0
IoT          507.35      350.45      459.4                   459.4
Financial    33186.1     28854.35    23117.75                23117.75
Ratings      1250.35     390.65      NA                      NA

Table A.7: TSDB mean response size in bytes for representative queries.

Bibliography

[1] E. F. Codd. A Relational Model of Data for Large Shared Data Banks. Commun. ACM, 13(6):377–387, June 1970.

[2] Andrew Pavlo and Matthew Aslett. What's Really New with NewSQL? SIGMOD Rec., 45(2):45–55, September 2016.

[3] Katarina Grolinger, Wilson A. Higashino, Abhinav Tiwari, and Miriam AM Capretz. Data management in cloud environments: NoSQL and NewSQL data stores. Journal of Cloud Computing: Advances, Systems and Applications, 2(1):22, December 2013.

[4] Rick Cattell. Scalable SQL and NoSQL data stores. ACM SIGMOD Record, 39(4):12, May 2011.

[5] Andreas Bader, Oliver Kopp, and Michael Falkenthal. Survey and Comparison of Open Source Time Series Databases. Gesellschaft für Informatik e.V., 2017.

[6] Yueguo Chen. TS-Benchmark: A benchmark for time series databases. http://prof.ict.ac.cn/Bench18/chenyueguo.pdf, June 2018.

[7] Rui Liu and Jun Yuan. Benchmark Time Series Database with IoTDB-Benchmark for IoT Scenarios. arXiv:1901.08304 [cs], January 2019.

[8] Kaippallimalil J. Jacob and Dennis Shasha. FinTime: A financial time series benchmark. SIGMOD Record, 28:42–48, 1999.

[9] Hoang Anh Dau, Eamonn Keogh, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, Yanping, Bing Hu, Nurjahan Begum, Anthony Bagnall, Abdullah Mueen, and Gustavo Batista. The UCR Time Series Classification Archive. October 2018. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.

[10] Dheeru Dua and Casey Graff. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml, 2017.

[11] R.J. Hyndman. Time Series Data Library. https://datamarket.com/data/list/?q=provider:tsdl.

[12] Time-series data on data.world: 34 datasets. https://data.world/datasets/time-series.

[13] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford Large Network Dataset Collection. June 2014.

[14] Davis Blalock, Samuel Madden, and John Guttag. Sprintz: Time Series Compression for the Internet of Things. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2(3):93:1–93:23, September 2018.

[15] Robert Allen. Case Study: How Houghton Mifflin Harcourt gets real-time views into their AWS spend with InfluxData, October 2017.

[16] Adam Wegrzynek. Towards the integrated ALICE Online-Offline monitoring subsystem. https://indico.cern.ch/event/587955/contributions/2937431/attachments/1678739/2706702/CHEP-2018.pdf, September 2018.

[17] Mario Luca Bernardi, Marta Cimitile, Fabio Martinelli, and Francesco Mercaldo. A Time Series Classification Approach to Game Bot Detection. In Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, WIMS '17, pages 6:1–6:11, New York, NY, USA, 2017. ACM.

[18] Yanpei Chen, Francois Raab, and Randy Katz. From TPC-C to Big Data Benchmarks: A Functional Workload Model. In David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Friedemann Mattern, John C. Mitchell, Moni Naor, Oscar Nierstrasz, C. Pandu Rangan, Bernhard Steffen, Madhu Sudan, Demetri Terzopoulos, Doug Tygar, Moshe Y. Vardi, Gerhard Weikum, Tilmann Rabl, Meikel Poess, Chaitanya Baru, and Hans-Arno Jacobsen, editors, Specifying Big Data Benchmarks, volume 8163, pages 28–43. Springer Berlin Heidelberg, Berlin, Heidelberg, 2014.

[19] DB-Engines Ranking per database model category. https://db-engines.com/en/ranking_categories.

[20] Y C Tay. Data Generation for Application-Specific Benchmarking. VLDB, Challenges and Visions, 7:4, 2011.

[21] Xiaolan Zhang and Margo Seltzer. Application-Specific Benchmarking. Harvard University, 2001.

[22] Ajay Joshi, Lieven Eeckhout, and Lizy John. The Return of Synthetic Benchmarks. In 2008 SPEC Benchmark Workshop, pages 1–11, 2008.

[23] A. Chakraborti, M. Patriarca, and M. S. Santhanam. Financial time-series analysis: A brief overview. arXiv:0704.1738 [physics, q-fin], pages 51–67, 2007.

[24] Boris Marjanovic. Huge Stock Market Dataset. https://kaggle.com/borismarjanovic/price-volume-data-for-all-us-stocks-etfs.

[25] F. Maxwell Harper and Joseph A. Konstan. The MovieLens Datasets: History and Context. ACM Trans. Interact. Intell. Syst., 5(4):19:1–19:19, December 2015.

[26] Code for comparison write ups of InfluxDB and other solutions: Influxdata/influxdb-comparisons. InfluxData, May 2019.

[27] Tomás Senart. HTTP load testing tool and library. tsenart/vegeta. https://github.com/tsenart/vegeta, May 2019.

[28] Jef Poskanzer. http_load. https://acme.com/software/http_load/.

[29] NoSQL data store using the seastar framework, compatible with Apache Cassandra: Scylladb/scylla. https://github.com/scylladb/scylla, May 2019.

List of Abbreviations

ACID Atomicity, Consistency, Isolation, Durability
API Application Programming Interface
ARIMA AutoRegressive Integrated Moving Average
CAP Consistency, Availability and Partition Tolerance
CERN European Organization for Nuclear Research
CPU Central Processing Unit
CRUD Create, Read, Update and Delete
CSV Comma-separated values
CTSDB Cloud Time Series Database
DBMS Database management system
ETF Exchange-Traded Fund
HTTP HyperText Transfer Protocol
IEEE Institute of Electrical and Electronics Engineers
IMDb Internet Movie Database
IoT Internet of Things
JSON JavaScript Object Notation
KPI Key Performance Indicator
MIT Massachusetts Institute of Technology
NoSQL Not Only SQL
OLAP Online Analytical Processing

RAM Random Access Memory
RDBMS Relational Database Management System
REST Representational State Transfer
SNAP Stanford Network Analysis Project
SQL Structured Query Language
STAC Securities Technology Analysis Center
TPC Transaction Processing Performance Council
TS Time Series
TSDB Time Series Database
TSDL Time Series Data Library
UCI University of California, Irvine
UDP User Datagram Protocol
URL Uniform Resource Locator
USD United States Dollar
YCSB Yahoo! Cloud Serving Benchmark

List of Figures

5.1 Storage efficiency of different TSDBs in bytes per data point
5.2 Relative storage efficiency of different TSDBs per data point compared to the CSV source format
5.3 Data points ingested per second
5.4 Maximum requests per second
5.5 Mean latency per request
5.6 95th percentile of latency per request
5.7 Mean size in bytes of the TSDB response

List of Tables

3.1 Evaluation of existing TSDB benchmarks

4.1 Overview of workload data sets

A.1 Data ingestion speed in points per second
A.2 Storage efficiency per data point, relative to the CSV source format
A.3 Maximum requests per second performed using representative queries
A.4 TSDB mean request latency in milliseconds for representative queries
A.5 TSDB 95th percentile request latency in milliseconds for representative queries
A.6 Number of timeouts during the latency and response size tests
A.7 TSDB mean response size in bytes for representative queries