Performance comparison between Sharded NewSQL TiDB and SQL MariaDB

Bachelor Degree Project in Information Technology
Basic level, 30 ECTS
Spring 2020

Mathias Johansson, Jonatan Röör

Supervisor: András Márki
Examiner: Yacine Atif

Abstract Databases are used extensively for websites and a large amount of websites are built upon Wordpress. Wordpress uses MySQL compatible databases and for many larger websites it can be imperative to have the best possible performance. Recently, NewSQL databases have been appearing that combine the features of NoSQL databases with SQL compatibility and ACID compliance that is usually not found in NoSQL databases. Due to their recency, there is a knowledge gap in the literature regarding NewSQL databases. Therefore, this work compares the performance of the NewSQL database TiDB against the SQL database MariaDB in a performance benchmark. The benchmark includes three testing approaches with an aim of testing multiple performance aspects. These include load testing, complex queries and performance in a realistic environment.

Results from this thesis show that TiDB achieves better average response time for complex queries and in load testing where the databases and load is large but gets worse results for simple queries and small datasets. MariaDB performs better when used with a web server and with tests that involve write-operations.

Keywords: Database, Performance comparison, NewSQL, TiDB, SQL, MariaDB

Table of contents

1 Introduction
2 Background
2.1 Databases
2.2 SQL
2.3 NoSQL
2.4 NewSQL
2.5 TiDB
2.6 Wordpress
3 Problem
3.1 Problem background
3.2 Problem description
3.3 Aim
3.4 Research questions
3.5 Objectives
3.6 Hypotheses
4 Method
4.1 Empirical Strategies
4.2 Related research
4.3 Approach
5 Experiment
5.1 Benchmarking tools and relevant variables in the domain
5.1.1 Sysbench
5.1.2 Siege
5.1.3 Database queries
5.2 Benchmarking setup
5.3 Results
5.3.1 Sysbench
5.3.2 Database queries
5.3.3 Wordpress - Siege load testing
5.4 Analysis & Conclusions
6 Discussion
6.1 General discussion
6.1.1 Documentation and community
6.1.2 TiDB write performance
6.2 Research usefulness
6.3 Ethics and Validity threats
6.3.1 Ethics
6.3.2 Validity threats
6.4 Future work
7 References

1 Introduction

Websites that require persistent storage usually use databases to store data such as user information and product catalogs. As the number of users on a website increases, it becomes necessary to improve the server architecture to keep up with demand. The web server layer can be improved by adding more hardware resources to the web server, called vertical scaling, or by adding new web servers, called horizontal scaling. More resources can be added to the database server as well to improve performance; however, adding new database servers introduces a number of problems.

NoSQL databases can make use of sharding to split their data across several physical locations, referred to as nodes, improving query response times and redundancy. NoSQL databases also do not enforce a strict table structure and can achieve better response times than traditional databases even on single nodes. Győrödi, Győrödi, Pecherle & Olah (2015) compared the relational SQL database MySQL to the NoSQL database MongoDB and found that MongoDB gave lower execution times for all operations tested. However, NoSQL databases typically require the website to be built around them, which makes it difficult to move existing websites to non-relational NoSQL databases. NoSQL databases also often cannot guarantee full ACID compliance (Grolinger, Higashino, Tiwari & Capretz 2013).

NewSQL databases address this by providing full ACID compliance and SQL interfaces while still allowing for horizontal partitioning. TiDB is one such NewSQL implementation, providing a MySQL-compatible SQL interface and sharding capabilities that can potentially increase scalability and general performance. Many studies have compared NoSQL databases to SQL databases, but NewSQL databases have not been thoroughly compared to SQL databases. This thesis compares the performance of TiDB and MariaDB to determine when it might make sense to use a NewSQL database over a traditional SQL database. The study consists of a series of benchmarks that test different use cases and load configurations to see when each database achieves the lowest response time. The Wordpress website framework is used to evaluate how well the databases would work with a real website.

2 Background

2.1 Databases

Almost everything today that needs to store a large amount of data uses a database of some kind. Whether it is an online store keeping products and customer information or a bank storing account balances, databases can be used to store all kinds of data. There is a multitude of database implementations with different characteristics and features, and their database management systems can be controlled through specific programming languages and APIs. Most database implementations can be categorized into one of the following groups: SQL, NoSQL and, more recently, NewSQL.

2.2 SQL

Structured Query Language (SQL) is used for managing structured data in relational database management systems. Elmasri & Navathe (2011) describe the relational model as a representation of the database as a collection of relations. A relation can be represented as a table of data. Each relation consists of a number of attributes defining what values each row will contain as well as which datatype will be used to store them. The formal definition of the relational model is grounded in mathematics, using set theory and first-order predicate logic to reason about databases.

A common feature of relational databases is support for transactions. Transactions group several queries into a single unit of work that is either performed completely or not at all. Using transactions, a system can avoid the situation where an operation is left half-completed because a query failed in the middle of it (The PostgreSQL Global Development Group 2020). ACID is a set of properties that database transactions should possess in order to guarantee validity under all circumstances. These properties are Atomicity, Consistency preservation, Isolation and Durability (Elmasri & Navathe 2011).

● Atomicity means that a transaction should always be performed entirely or not at all. In practice, this means that if a transaction fails, e.g. because the system crashes during a query, the database must recover and remove any trace of the failed transaction.

● Consistency preservation means that a transaction should take the database from one consistent state to another, without interference from other transactions.

● Isolation means that transactions should appear to execute in isolation, without interference from other transactions executing concurrently.

● Durability means that all committed changes to the database must persist and that no data should be lost due to failure.
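The atomicity property can be illustrated with a small sketch using Python's built-in sqlite3 module (an ACID-compliant embedded database, chosen here only because it needs no server; the table and the simulated crash are invented for this example):

```python
import sqlite3

# Two accounts, each with a balance of 100, in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 100)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
        raise RuntimeError("simulated crash mid-transfer")
        # The matching credit is never reached:
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 2")
except RuntimeError:
    pass

# Atomicity: the half-finished transfer left no trace in the database.
balances = [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
print(balances)  # -> [100, 100]
```

The `with conn:` block is sqlite3's shorthand for wrapping the enclosed statements in a transaction that is rolled back if an exception escapes it.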

Some notable SQL databases are MySQL and MariaDB:

● MySQL is an open source relational database management system originally created by MySQL AB. Following a series of acquisitions, MySQL changed ownership first to Sun Microsystems and then to Oracle Corporation.

● MariaDB is a MySQL compatible database that was forked from MySQL in 2010 after the Oracle acquisition. Its lead developer is one of the core developers behind MySQL and the project is being developed with an emphasis on providing a high level of compatibility with MySQL.

Both MySQL and MariaDB can be ACID compliant depending on which storage engine is used. The default storage engine in both databases provides ACID compliance (Oracle Corporation 2020).

SQL databases are among the most popular database types, but there are others, like NoSQL, where different implementations focus on better query performance or include specific features not usually found in SQL databases.

2.3 NoSQL

Non-relational databases, also called NoSQL databases, are mainly thought of as dissimilar to SQL databases, but the term can also be read as "not only SQL", since it is possible for a NoSQL database to support query languages like SQL. Compared to traditional relational databases, NoSQL databases can provide different storage mechanisms for data. Relational SQL databases use a tabular structure which can be read and understood without additional explanation (Meier & Kaufmann 2019). NoSQL databases, however, utilize storage strategies that do not necessarily have a common or easily understandable structure for how data is stored.

NoSQL databases are categorized by the type of data storage mechanism they employ. Meier & Kaufmann (2019) describe four general strategies: key-value stores, document stores, column stores and graph databases.

● Key-value databases are one of the simplest NoSQL types, as data is simply represented as key-value pairs where a unique key points to a specific value in the database. This structure resembles a hash table in function and does not enforce any particular structure on the value fields.

● Document stores are based on the key-value structure but save data into documents in formats such as JSON, BSON, XML and YAML. This means that each document can store many fields, similar to how a relational database stores a row. Different documents are free to contain different fields from each other, in contrast with relational databases, where each row needs to contain all the fields its table has defined.

● Column stores are similar to key-value databases except a single key can link to multiple columns and a variable number of columns can exist in a single record.

● Graph databases use graphs where edges and nodes represent relations between values and objects in the database.
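The difference between the first two strategies can be sketched in a few lines of Python (the keys and documents are invented for illustration):

```python
import json

# Key-value store: a unique key maps to an opaque value, like a hash table.
kv = {"user:42:name": "Alice", "user:42:visits": "17"}

# Document store: key-value where each value is a structured document (JSON here).
# Note that the two documents carry different fields, which a relational table
# would not allow without a schema change.
docs = {
    "post:1": json.dumps({"title": "Hello", "tags": ["intro", "news"]}),
    "post:2": json.dumps({"title": "Update", "author": "alice"}),
}

doc = json.loads(docs["post:2"])
print(doc["author"])  # -> alice
```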

A further difference from SQL is that NoSQL databases, unlike MySQL or MariaDB, often compromise on full ACID compliance in exchange for improved performance and flexibility, which translates into reduced execution times for queries and widens the range of features that can be implemented in the database (Gupta, Tyagi, Panwar, Sachdeva & Saxena 2017). Consistency is often replaced with the concept of "eventual consistency", where newly written data may take time to propagate but will eventually be written to all nodes. However, some fully ACID-compliant NoSQL databases do exist.

Compromising on ACID compliance can result in better performance and a richer feature set compared to SQL databases. But in security-critical or business-oriented systems, reliability and consistency can be important aspects. NewSQL is another database variety that aims to combine the strengths of both NoSQL and SQL.

2.4 NewSQL

NewSQL is a more recent term for databases that try to combine the scalability of NoSQL databases with the ACID guarantees for transactions (Pavlo & Aslett 2016). A feature NewSQL databases often use is horizontal partitioning, in order to improve scaling without requiring any additional application logic.

Partitioning tables into several smaller tables is one way of achieving better performance in a database when tables become very large (Microsoft 2020). There are two main techniques for dividing large database tables: vertical and horizontal partitioning. Vertical partitioning divides and relocates specific columns of a table into another table. Typically, the columns moved to a separate table contain a large data field, such as a long text comment, which could otherwise dramatically decrease the performance of queries (Elmasri & Navathe 2011). Horizontal partitioning divides a database table by splitting it by rows into several smaller tables, reducing the size of each individual table. Partitioning database tables can result in better query performance as the table size is reduced. Another way of increasing query performance is to add more resources, which is called scaling.
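Both techniques can be sketched on an in-memory "table" in Python (the column names and the range split are invented for illustration):

```python
# A toy "users" table: each row has a small id/name and one large text field.
rows = [
    {"id": i, "name": f"user{i}", "bio": "long comment text " * 50}
    for i in range(1, 7)
]

# Vertical partitioning: relocate the large column into a separate table,
# linked back to the original rows via the primary key.
users_main = [{"id": r["id"], "name": r["name"]} for r in rows]
users_bio = [{"id": r["id"], "bio": r["bio"]} for r in rows]

# Horizontal partitioning: split by rows, here on ranges of the primary key,
# so each partition is smaller than the original table.
partition_low = [r for r in rows if r["id"] <= 3]
partition_high = [r for r in rows if r["id"] > 3]

print(len(partition_low), len(partition_high))  # -> 3 3
```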


Figure 1: Vertical and Horizontal scaling

Scaling databases can be done in two ways, vertically and horizontally (Michael, Moreira, Shiloach & Wisniewski 2007). Vertical scaling means that additional resources are added to already existing servers to achieve better performance. More RAM, a better CPU or faster storage are examples of hardware that could be added to servers. Horizontal scaling, on the other hand, means that more servers are added so that data can be distributed over multiple physical servers, possibly at different locations. The difference is illustrated in Figure 1, where the blocks represent server resources stacked at a single location for vertical scaling and spread out across servers for horizontal scaling. Horizontal scaling for databases can be implemented in many different ways, one of which is sharding.

Sharding is essentially horizontal partitioning where the tables in the database are partitioned into smaller subtables and then divided, or horizontally scaled, across separate servers. This allows for increased performance, as each server only has to search through a subset of the total database (Drake 2019). It can also make the processing of larger queries easier, as big queries can be divided into multiple small queries that are executed in parallel on separate servers.

Sharding can be implemented in multiple configurations. It is often called a shared-nothing architecture, where data is horizontally partitioned between multiple nodes. Each node contains only a fragment of the data, and that data exists only on that node, so there is no redundancy. The lack of redundancy is a problem from a reliability perspective: if a single node disappears, data is lost. A more traditional replication scheme is to copy all data to all nodes, which allows for both data redundancy and load balancing of read requests. However, since each node then contains all of the data, write requests to the database may require synchronization across all nodes (Storti 2017).

An alternative is to split the data over some, but not all, nodes to get the advantages of both methods. This makes it possible to achieve both reliability and load balancing with a lower performance impact for write operations, as only a few nodes require coordination. An example of a database that utilizes this replication scheme is TiDB.
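A minimal sketch of this middle-ground scheme (the replica count, node names and round-robin placement are invented; real systems use more sophisticated placement logic):

```python
# Each data fragment ("region") is stored on REPLICAS of the N nodes, so it
# survives single-node failures without every node holding every region.
REPLICAS = 3
NODES = ["node0", "node1", "node2", "node3", "node4"]

def placement(region_id: int) -> list[str]:
    """Place a region on REPLICAS consecutive nodes, round-robin."""
    return [NODES[(region_id + i) % len(NODES)] for i in range(REPLICAS)]

# A write to region 4 only needs coordination among 3 of the 5 nodes,
# and losing any single node still leaves two copies of the region.
print(placement(4))  # -> ['node4', 'node0', 'node1']
```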

2.5 TiDB

TiDB is a NewSQL database, which allows it to provide scalability similar to NoSQL while still maintaining the ACID guarantees of relational databases (Pingcap 2020). TiDB uses NoSQL strategies for data storage but also provides MySQL compatibility, which allows the database to be accessed using MySQL syntax even though the underlying storage is non-relational. The non-relational data structure of TiDB allows for horizontal partitioning across multiple nodes, which can be beneficial for handling large amounts of data.

TiDB consists of three main services: the TiDB service, which receives and processes SQL database requests; the TiKV service, which stores the data; and the Placement Driver service, which holds the keys to the respective values in the database so that the TiDB service can access data from TiKV (Pingcap 2020).

The TiDB service receives SQL queries from an application and requests data from TiKV through the Placement Driver. It receives the location of the requested data from the Placement Driver, fetches the data directly from TiKV and returns the result of the database query.

The data stored in TiKV is divided into regions where each region has a specific key range. TiKV uses the RocksDB database for data storage and the Raft protocol for coordinating between nodes (TiKV Project Authors 2020). RocksDB is based on Google’s LevelDB database and provides some ACID guarantees (Facebook Database Engineering Team 2020).

The Placement Driver stores keys to the region locations in TiKV in order to keep track of where data is located. It also manages clusters, in addition to handling the scheduling and load balancing of the TiKV service.
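The lookup role of the Placement Driver can be sketched as a key-range-to-node map (the region boundaries and node names are invented; real TiDB metadata is considerably richer):

```python
import bisect

# Regions cover contiguous key ranges; the lookup maps a key to the region
# holding it, and thus to a TiKV node that stores that region.
region_start_keys = [0, 1000, 2000, 3000]        # first key of each region
region_locations = ["tikv-a", "tikv-b", "tikv-a", "tikv-c"]

def locate(key: int) -> str:
    """Return a node holding the region whose key range contains `key`."""
    index = bisect.bisect_right(region_start_keys, key) - 1
    return region_locations[index]

print(locate(1500))  # -> tikv-b
```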

Each service consists of a number of processes running on separate nodes, which makes load balancing possible. The TiDB service is distributed over multiple nodes, and queries can be load balanced between processes so that multiple queries can be executed simultaneously. The Placement Driver service also runs on multiple nodes, but only the leader instance, determined by the Raft consensus algorithm, handles operations; the others exist only to replace the leader if it crashes. The TiKV service is also distributed over multiple nodes, which makes it possible for the database to take advantage of the replication scheme where data is divided over some but not all nodes. Each node holds multiple, but not the same, regions of data, and each region is duplicated on some but not all nodes to achieve data redundancy.

2.6 Wordpress

This study looks at Wordpress as an example of a popular Content Management System (CMS) for building websites. CMSs provide easy-to-use interfaces for creating websites. Estimating the exact number of websites and the CMSs they use is difficult, but according to W3Techs (2020), 36.1% of the 10 000 000 most visited websites use Wordpress. This makes Wordpress by far the most popular CMS, as the second entry, Joomla, stands at 2.4%. Wordpress is written in PHP and was made to use the MySQL database. Using a database as part of a website CMS like Wordpress can impose a different type of workload compared to other applications. As the number of websites using Wordpress is large, database performance for Wordpress workloads is important.

3 Problem

3.1 Problem background

Wordpress is one of the most popular CMSs (W3Techs 2020) and uses MySQL-compatible relational databases, which are primarily able to scale vertically by adding more resources to the database node. Horizontal scaling is possible using, for example, master-slave replication, but it comes with drawbacks such as performance issues.

NoSQL databases can make use of sharding, where data is divided between nodes so that the same data only exists on a few nodes. This can result in better performance than other replication schemes, but often at the cost of data consistency guarantees and ACID compliance. NewSQL databases, however, combine the strengths of both SQL and NoSQL: they can make use of replication schemes like sharding while incorporating ACID compliance and SQL interfacing, which many NoSQL databases do not. NewSQL databases could therefore act as a replacement for traditional relational databases and provide better scalability while still supporting the SQL interface.

Sharding should allow for larger databases and higher server load while retaining similar performance. Sharding is, however, inherently more complicated to implement than a single-server relational SQL database. Performance may suffer because of the overhead that comes with sharding: several layers of services have to communicate in order to answer a query, and data is divided across several servers, which can require coordination between servers to collect all the data for a query.

3.2 Problem description

When developing a website using Wordpress, it is unknown if a sharded NewSQL database can achieve better performance compared to an SQL database.

3.3 Aim

The aim of this study is to investigate if a NewSQL database like TiDB can achieve better performance than an SQL database like MariaDB, both in general and when used with Wordpress.

This will help answer the question of whether or not NewSQL is able to overcome some of the drawbacks that are observed in relational database systems with large databases. The architecture of TiDB comes with a substantial performance overhead in comparison to other SQL databases. However, depending on the size of the dataset and the types of queries that are used, this architecture might allow TiDB to perform better in certain scenarios. Therefore, both performance and how the performance scales will be looked at in this study.

Performance will be measured as the average response time for queries, and scaling is how the performance changes and compares across the different testing scenarios.

Fatima & Wasnik (2016) have observed better performance for the NewSQL database VoltDB in comparison to NoSQL and SQL for an internet of things workload, but there is not much other research comparing NewSQL with other database types. However, others have compared NoSQL and SQL databases with varying results. Patil, Hanni, Tejeshwar & Patil (2017) have observed better performance for the NoSQL database MongoDB with sharding compared to MySQL. Győrödi et al. (2015) also observes better performance for MongoDB compared to MySQL, but without sharding.

Generally, there is not much research comparing NewSQL against SQL or NoSQL databases, with the exception of Fatima & Wasnik (2016), and even less targeting a website use case. The research by Patil et al. (2017) and Győrödi et al. (2015) indicates the potential performance that could be achieved with NewSQL databases using sharding, as the architecture and functionality they offer are similar to NoSQL databases such as MongoDB. Fatima & Wasnik (2016) tested another NewSQL database with favourable results, but how TiDB's specific implementation of NewSQL performs remains unknown.

Because of the wide adoption of Wordpress for websites, the database systems will be evaluated both in isolation and when used in a larger web stack running a Wordpress website. This will ensure that the databases are tested in an environment comparable to a real use case and give confidence that the results seen when testing databases in isolation will also appear outside the testing environment.

3.4 Research questions

RQ1: How does sharded TiDB compare to MariaDB in terms of performance for queries directly against the database?

RQ2: How does sharded TiDB compare to MariaDB in terms of performance for Wordpress queries?

3.5 Objectives

The objectives to achieve the aim, each with a description, are as follows:

1. Gather knowledge about the domain

Research the problem area and find methods that can test different aspects of performance.

2. Gather data about pure databases

3. Gather data about Wordpress stack

Determine what variables may affect performance and establish an approach for comparing the databases. Then compare the databases using the established approaches with a focus on the relevant variables. Objective 2 will look at the databases directly, whereas objective 3 will look at the databases when used in a Wordpress installation.

4. Analyze and discuss results

Present the results from the comparison and analyze how they contrast to the research questions and hypotheses.

3.6 Hypotheses

It is likely that neither MariaDB nor TiDB will be the faster database in every situation. Instead, one might perform better under certain queries and database loads but worse in other circumstances. The hypotheses are written to accommodate this.

1. MariaDB should achieve better performance for small databases and less complex queries due to the overhead of TiDB’s architecture.

2. TiDB should scale better than MariaDB as databases grow, since each node only has to search a subset of the total database.

3. TiDB should scale better than MariaDB as database load increases since the database requests can be spread out over several nodes.

4. TiDB should perform better for complex queries compared to MariaDB since several nodes can cooperate to complete queries.

4 Method

4.1 Empirical Strategies

This study will compare the performance between SQL and NewSQL by using MariaDB and TiDB. The method chosen for this performance benchmark is an experiment. An experiment gives greater control over variables that could affect the result and makes avoiding external factors easier (Wohlin et al. 2012).

Two other methods were considered as alternatives to an experiment: surveys and case studies. A case study could give more realistic results, as a specific scenario is observed in its real-life context. However, the results may be difficult to generalize outside the study, as they only apply to the specific scenario in which they were observed. This also applies to experiments, but in comparison to case studies the surrounding variables can be controlled, which makes an experiment more reproducible.

The other alternative considered was a survey. Using a survey, this study could have been performed by asking people who have used both databases about the differences between them and when one might be preferred over the other. A survey could also give more realistic results, as participants would have practical knowledge of both databases (Wohlin et al. 2012). However, this would not have been feasible, as it would require finding participants with experience of both MariaDB and TiDB. TiDB in particular would be difficult to find participants for, as it was released comparatively recently.

4.2 Related research

Fatima & Wasnik (2016) have performed a comparison between the NewSQL database VoltDB, the NoSQL database MongoDB and the SQL database MySQL. Their comparison focused on how the databases handled a large internet of things workload, containing a single client issuing single write, read and delete operations, in addition to multiple clients with the same setup. VoltDB showed good results for write operations, better than both MongoDB and MySQL. For read operations, VoltDB performed much better than the other two, and when the number of clients was increased for both workloads, VoltDB retained the performance lead.

Győrödi et al. (2015) performed a comparative study between MySQL and the NoSQL database MongoDB with the purpose of determining whether it would be appropriate to switch a specific system from MySQL to MongoDB. The comparison includes how the database structure differs for storing data, how queries are performed and various implementation difficulties between the two. The databases are also compared in a performance test where the response times for different query types are measured. The performance tests consist of single queries of varying complexity that were timed to assess the performance difference. They conclude that MongoDB performs better than MySQL in all tests and that the flexibility of MongoDB could be beneficial in their specific situation.

Li & Manoharan (2013) compared the performance of an SQL database against multiple NoSQL databases. Their comparison focused on the performance of instantiate, read, write and delete operations between an SQL and various NoSQL databases based on different storage implementations. They arrive at the conclusion that the performance is dependent on the specific database and whether the tested query is a read or write operation. They do not see any correlation between the different storage implementations and performance, yet some NoSQL databases achieve better performance than the SQL database whilst others are slower.

Patil et al. (2017) also compared the performance of MySQL and MongoDB but with a focus on load balancing by utilizing the sharding capabilities of MongoDB. The comparison was carried out for a web application that handles user data, where the two databases are compared by inserting a varying amount of data. They conclude that MongoDB achieves better performance than MySQL for all of their tests.

Kaur & Sachdeva (2017) compared the performance of multiple NewSQL databases. Their comparison focuses on the execution time and latency of read, write and update requests for a big data management workload, in addition to exploring specific features of each database implementation they test. They conclude that one of the four databases they test, NuoDB, achieves better results overall, but that additional research is required to evaluate the performance comprehensively. Testing the databases with multiple nodes, exploring security vulnerabilities and testing in more use-case-oriented scenarios are mentioned as interesting aspects.

Tongkaw & Tongkaw (2016) compared the performance of MariaDB and MySQL utilizing the benchmarking software Sysbench and OLTP-Bench. The comparison contains two datasets with different amounts of data that were tested using the benchmarking software. They conclude that MySQL achieves better performance than MariaDB for their workload.

Most of the related research has performed normal database queries of varying complexity with both read and write tests. However, with the exception of Tongkaw & Tongkaw (2016) and Fatima & Wasnik (2016), there are few tests comparable to a realistic workload with multiple clients or multithreaded load testing. NoSQL-specific features such as sharding have also seen little testing. The work by Patil et al. (2017) tests sharding, but only the performance of insert queries for a specific NoSQL database, and a single query type does not cover all aspects a database can be used for. Kaur & Sachdeva (2017) compare multiple NewSQL databases and mention that future work could explore the sharding capabilities of the databases. The conclusion from this is that there is little testing of how NewSQL databases with sharding, like TiDB, perform.

The main difference between the experiment in this thesis and the related research is that most of the mentioned deficiencies will be addressed in some aspect. In order to make the testing as extensive as possible, the proposed method includes standard database queries, load testing and web server benchmarking performed in a realistic environment.

Standard database queries have been widely used in the related research and can serve as a good indicator of the performance of specific queries. Li & Manoharan (2013) note in their study that they did not test any complex operations, and that operations with additional complexity could have had a great impact on their results. A range of both simple and complex queries, with and without join operations, could be better for testing performance, especially since the NewSQL implementation will make use of sharding, which could allow for further optimizations in more complex queries.

The load testing performed by Tongkaw & Tongkaw (2016) makes use of two datasets but gives only vague details on their exact sizes. The database sizes and the number of clients in Fatima & Wasnik's (2016) comparison are also unknown. The scale of the databases used in a study can have a large impact on the response time of queries, since a larger database naturally takes longer to search through. From a database comparison point of view, it is important to test databases of different sizes, since the implementation can have a large effect on the scaling properties of the database system.

Synthetic load testing can simulate a simple workload with multiple clients instead of just single queries performed in series. Testing like this could better simulate real use cases where several users attempt to use a website at the same time. However, unless the database is actually tested as part of a real web stack, it is difficult to know with confidence how it will perform in real life. Therefore, it is beneficial to also test the performance with a real dataset and real queries, instead of just randomly generated data.

4.3 Approach

Benchmarking is split into three approaches: load testing, standard queries of varying complexity, and testing a complete web stack on a real website. Load testing assessed how the databases handle a large number of simultaneous users. The complex query benchmark tested how well the databases handle singular large queries that may need to access several different tables to complete. The final web-stack benchmark load tested the databases when used as part of a real website, with queries that a real website might use.

In order to achieve more generalizable results, the first two benchmarks were performed directly against the databases without a web-server layer and correspond to the first research question. This makes it possible to test general performance in different situations and hopefully gives results that generalize to use cases not directly examined in this study. The first benchmark ran simple queries against each database with a varying number of simulated users and varying database sizes while measuring average response time. This tested how each database performs as load and dataset size change, but does not test how the databases handle larger queries.

The second benchmark specifically tested larger queries one-by-one to see how well each database can optimize a single more complex query. A “complex” query for the purposes of this study is a query that needs to either search through unindexed columns in large tables or requires joins between tables. Both of these operations can potentially take a long time to complete with large tables.

The third benchmark tested the databases as part of a real website, which corresponds to the second research question. This can help give confidence in the previous results by showing whether the general performance differences between the databases also appear when the databases are used in a web stack. The benchmark consisted of repeated page loads with a varying number of clients where complete page download time was measured. Static website resources such as images and JavaScript files were excluded from the download time as they do not involve the database.

5 Experiment

5.1 Benchmarking tools and relevant variables in the domain

This chapter describes the three benchmarking approaches.

5.1.1 Sysbench

Various tools are used to execute the experiment. On the database side the Sysbench tool is used, as it allows for parallel execution of a large number of queries through one of several built-in OLTP benchmarks (Kopytov 2020). The queries can be executed against any number of tables of a user-specified size. The tables are created by Sysbench and filled with randomly generated data. The tables share the same structure and data types, but the content is varied. The Sysbench queries select, update or delete data from a random row in the table and all consist of variations on “SELECT/UPDATE/DELETE x FROM y WHERE id=z”. Each query should execute quickly as it only searches on the indexed primary key column of the table. Sysbench can simulate both a read-only and a read-write workload; the read-write workload adds a smaller number of UPDATE queries to the SELECT queries in order to simulate an OLTP many-reads, few-writes workload.
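The shape of these point queries can be illustrated with a short sketch. The table name, id range and column names below are illustrative stand-ins only; Sysbench generates its own schema and query templates:

```python
import random

def make_point_query(table: str, max_id: int, kind: str = "SELECT") -> str:
    """Build a Sysbench-style point query against a random primary key.

    `table`, `max_id` and the column names are hypothetical stand-ins;
    Sysbench creates its own tables and fills them with random data.
    """
    row_id = random.randint(1, max_id)
    if kind == "SELECT":
        return f"SELECT c FROM {table} WHERE id={row_id}"
    if kind == "UPDATE":
        return f"UPDATE {table} SET k=k+1 WHERE id={row_id}"
    if kind == "DELETE":
        return f"DELETE FROM {table} WHERE id={row_id}"
    raise ValueError(f"unknown query kind: {kind}")

query = make_point_query("sbtest1", 10_000)
```

Because every query filters on the indexed `id` column, each individual query is cheap; the interesting variable is how the database copes with many such queries arriving concurrently.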

The Sysbench benchmark is executed on both databases and uses three database sizes, consisting of 10 000, 1 000 000 and 100 000 000 rows of data respectively. Load is simulated by varying the number of threads that send database requests between 1, 10 and 100. The number of threads determines how many concurrent queries Sysbench will send to the database. Both read-only and read-write queries are tested for every database size and thread configuration, since the databases can respond differently to the two types of queries.

Sysbench uses events, where a single event consists of a set number of queries, and the benchmark runs until the specified number of events has been reached. Each read-write event generates 20 queries divided into 14 reads, 4 writes and 2 other queries; a read-only event generates 16 reads. Each configuration executes 1000 events; the multithreaded tests divide the number of events between the threads rather than creating 1000 events per thread. This results in a total of 20 000 queries for the read-write configurations and 16 000 queries for the read-only configurations. Each configuration is executed 10 times in order to minimize variance between the tests, giving 200 000 total queries for the read-write tests and 160 000 for the read-only tests, where each query is random. The Sysbench benchmark returns the response time for each event, that is, the time it takes to complete all of the event's queries, which is used for evaluating the performance of the databases.
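The query totals above follow directly from the event counts and the per-event query mix; as a quick sanity check:

```python
# Per-event query mix as described in the text: 14 reads + 4 writes
# + 2 other queries for read-write events, 16 reads for read-only.
QUERIES_PER_EVENT = {"read_write": 14 + 4 + 2, "read_only": 16}
EVENTS_PER_CONFIGURATION = 1000  # shared between all threads, not per thread

def total_queries(mode: str, runs: int = 1) -> int:
    """Total queries one configuration generates over `runs` repetitions."""
    return QUERIES_PER_EVENT[mode] * EVENTS_PER_CONFIGURATION * runs

# One run: 20 000 read-write queries and 16 000 read-only queries;
# ten runs: 200 000 and 160 000 queries respectively.
```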

5.1.2 Siege

Website load testing is performed using Siege. Siege is an open source program meant to simulate several users browsing a website at the same time. It does this by creating a number of threads that each repeatedly send GET requests to the specified page (Fulmer 2020). The number of simulated users and the running time are specified by the user. By default Siege also requests images and other static resources, such as JavaScript libraries, for the page being loaded. This is disabled for the study as the response time of requests for static resources would not be affected by differences in the database used.

The Wordpress load testing is run on both databases and consists of loading a single page multiple times with a simulated number of users. The web pages for this benchmark come from a real website supplied by the company Duva AB. Three web pages are used for the benchmark: the front page of the website, a single product page and the shopping cart page. The front page is loaded through Wordpress and requires a small number of products to be fetched from the database, in addition to other data that Wordpress requires. The product page loads a single product and related products. The shopping cart page involves a number of Wordpress requests but no actual products have to be loaded from the database. Each page is tested with 1, 5 and 25 threads that simulate load, and each test is repeated 250 times. The measured output variable for this test is the total time from sending the request to receiving the response for each request.
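The structure of such a load test can be sketched in a few lines. The `fetch` callable below is a hypothetical stand-in for downloading one page; in the actual experiment Siege itself sends the GET requests and records per-request response times:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def load_test(fetch, n_threads: int, repetitions: int) -> dict:
    """Time `repetitions` calls to `fetch` spread over `n_threads` workers.

    `fetch` stands in for one page download; the real benchmark used
    Siege to issue the requests and collect the timings.
    """
    def timed_fetch(_):
        start = time.perf_counter()
        fetch()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        times = list(pool.map(timed_fetch, range(repetitions)))
    return {"median": statistics.median(times), "mean": statistics.mean(times)}

# Example with a dummy "page" that takes about 5 ms to serve:
result = load_test(lambda: time.sleep(0.005), n_threads=5, repetitions=50)
```

Raising `n_threads` while keeping the page fixed is what exposes contention in the database behind the web server.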

5.1.3 Database queries

The database queries benchmark does not use any special software for executing the benchmark. Instead, raw SQL queries are run through a MySQL client and the reported execution time is stored in a log file. Three queries have been chosen for this test:

1. SELECT * FROM wp_postmeta LIMIT 10;
2. SELECT COUNT(*) FROM wp_postmeta;
3. SELECT post_title FROM wp_posts, wp_postmeta WHERE ID=post_id AND meta_key='total_sales' AND post_type='product' ORDER BY meta_value LIMIT 10;

The first query simply returns 10 rows from the database in no specific order. This query is expected to execute quickly since the database can simply return any 10 rows. It tests the databases in a similar way to the Sysbench queries, in that database architecture overhead is expected to play a large role in the results. The second query counts the number of rows in a large table, which requires the database to iterate through the entire table. The third query requires a join between tables and selects values based on non-indexed column values. This is the query with the most clauses out of all tested and should give an idea of how each database handles larger queries. Each query is executed 1000 times against the website's database.
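A minimal version of this timing loop might look as follows. `run_query` is a hypothetical stand-in for sending one SQL query through a client (the real benchmark stored the MySQL client's reported execution times), and the millisecond-granularity logging illustrates how precision is lost for sub-millisecond queries:

```python
import time

def benchmark(run_query, repetitions: int = 1000):
    """Execute `run_query` repeatedly and collect per-query times.

    `run_query` is a stand-in for issuing one SQL query; the real
    experiment logged the times reported by the MySQL client instead.
    """
    times_ms = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        elapsed = time.perf_counter() - start
        # Storing whole milliseconds truncates sub-millisecond queries
        # to 0 ms, losing precision for very fast queries.
        times_ms.append(int(elapsed * 1000))
    return times_ms

# A query that finishes well under a millisecond is logged as 0 ms:
fast_times = benchmark(lambda: None, repetitions=10)
```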

5.2 Benchmarking setup

Databases are set up on a server cluster consisting of 3 separate nodes. All nodes have identical hardware and each has two Xeon X5670 CPUs running at 2.93 GHz. Each CPU has 6 cores and can run 2 threads per core. All nodes run CentOS 7 and are equipped with 96 GB DDR3 RAM and a Samsung 970 EVO NVMe SSD. Both databases are set up in a containerized environment using Docker. Containers are often used for making deployment of applications easier as they can ensure that the running environment is the same in development as in production (Red Hat 2020). The Docker containers are set up in Kubernetes, which can be used for managing applications with multiple containers. Kubernetes can provide load balancing between containers and self-healing when containers fail, and makes it easy to control many different containers and the communication between them (The Kubernetes Authors 2020). Helm is an application manager for deployment in Kubernetes and is used for setting up both databases as it can facilitate management of separate web stacks (Helm Authors 2020).

Each database is set up in Kubernetes under a separate namespace where a number of pods contains the containerized applications. A namespace is a virtual cluster that separates applications and their resources from other parts of the server cluster. A pod is a group of containers that share the same resources and multiple pods can exist in a single namespace.

The MariaDB setup contains separate pods for the database and each instance of the Wordpress web server. Three Wordpress web servers are utilized for every database setup to avoid the web servers becoming bottlenecks. TiDB has the same Wordpress setup but additional resources for the database setup. The TiDB service that receives database requests consists of two pods, each on a separate node, to ensure reliability and enable load balancing. The Placement Driver service and the TiKV service are divided into three pods each, separated over different nodes for reliability and, in the TiKV service's case, replication of data.

All benchmarks are handled by a script that starts the benchmarking tool or executes the set number of queries for that part of the benchmark. The script is handled by a separate pod within the same namespace as either database stack. As the pod is on the same server cluster, the network latency should be minimal compared to sending queries from a separate computer. The normal database queries and the Sysbench load testing connect directly to the database pod, which avoids any latency from the surrounding environment, while the Wordpress queries make requests through the Wordpress webpage and therefore the complete web stack.

All benchmarks are run without any resource limits on the possible CPU usage for each pod. Kubernetes makes it possible to set CPU limits for each pod but does not allow for configuring limits for an entire namespace. It is possible to set a resource limit for the MariaDB database pods since the database is exclusively handled by one pod. However, the database in TiDB is divided into three separate pods and the load is not always distributed evenly between them, so one of the pods can occasionally require more processing power than the others. If the same CPU limit were imposed on both MariaDB and TiDB, MariaDB would be able to allocate the CPU resources to a single pod while TiDB would have to divide the allocation between all of its database pods. Dividing the resources would give each TiDB database pod a third of the processing power available to MariaDB, which could restrict TiDB's performance for certain queries. Therefore both databases are benchmarked without any limits on CPU usage.

5.3 Results

This chapter presents the results of the study.

5.3.1 Sysbench

Figure 2: Sysbench read-only results

Figure 2 shows the results from the Sysbench read-only benchmarks as a box plot. The graph uses a logarithmic scale for latency and shows the number of threads along the x-axis. Each table size is grouped together, with the smallest table of 10 000 rows on the left and the largest of 100 000 000 rows on the right.

The 10 000 row and 1 000 000 row tables both show the same general result: MariaDB achieves lower latencies than TiDB for these queries. The difference decreases somewhat at 100 threads. The 100 000 000 row table, however, shows that TiDB was less affected by the increase in dataset size than MariaDB and outperformed MariaDB for all thread configurations.


Figure 3: Sysbench read-write results

The Sysbench read-write results in Figure 3 show again that MariaDB performs better in most scenarios. However, MariaDB's latency seems to grow more quickly than TiDB's as the number of threads increases. The databases get closer to each other at 100 threads, which is most visible for the 1 000 000 and 100 000 000 row tables. TiDB also stays closer to a straight line when graphed, while MariaDB's curve seems to increase in gradient along the x-axis. Since the graph uses a log-log scale with the number of threads on the x-axis and latency on the y-axis, a straight line does not correspond to a linear relationship between x and y. Instead, a straight line implies a power-law relationship y = ax^k, where k is the slope of the line.
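The slope k of a straight line in a log-log plot can be read off from any two points, since log y = log a + k·log x. A small check with synthetic data:

```python
import math

def loglog_slope(x1, y1, x2, y2):
    """Slope k of the line through two points in log-log space.

    If latency follows y = a * x**k, the log-log plot is a straight
    line with inclination k; k = 1 would mean latency grows linearly
    with the number of threads, k > 1 faster than linearly.
    """
    return (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))

# Synthetic latencies generated from y = 0.5 * x**1.3 recover k = 1.3:
k = loglog_slope(1, 0.5, 100, 0.5 * 100 ** 1.3)
```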

5.3.2 Database queries

Figure 4: Average latency for query 1. Figure 5: Average latency for query 2.

Figure 6: Average latency for query 3.

Each graph shows the average latency of one query for each database. The results are graphed with a linear y-axis showing the latency in seconds. The MariaDB results in Figure 4 seem to indicate that a large number of samples were measured at 0 ms of latency. This is because query time was measured in whole milliseconds, which resulted in precision being lost for the first query's results. The insufficient measurement accuracy experienced in this instance did not affect the other queries. The results as graphed show that TiDB performed slightly worse in the first query but significantly better in the other two queries. The second query in particular (Figure 5) showed a 3 to 4 times difference in latency between the two databases. TiDB performed better in the third query (Figure 6) as well, with less than half the latency of MariaDB. Queries 2 and 3 likely let TiDB utilize its distributed design to divide the workload over several nodes, severely reducing latency.

5.3.3 Wordpress - Siege load testing

Figure 7 Figure 8

Figure 9

Figure 7, Figure 8 and Figure 9 show the results of the website load testing benchmark. The times are visualized in box plots along a logarithmic y-axis. MariaDB and TiDB get similar results in all configurations, but MariaDB always performs better when comparing medians. All pages give similar results to each other, and even changing the number of threads does not have a clear effect on the relative performance of the databases. MariaDB performs best in most measures, except possibly spread, as TiDB gets a smaller interquartile range and lower total spread in a few configurations.

5.4 Analysis & Conclusions

Four hypotheses were formulated focusing on different aspects of how the databases can differ in performance. The first hypothesis is “MariaDB should achieve better performance for small databases and less complex queries due to the overhead of TiDB's architecture” and applies to the Sysbench test. The results show that for simple queries and the smallest database in the Sysbench read-only test, MariaDB always achieved better results than TiDB, which agrees with the hypothesis. In the read-write test, MariaDB mostly achieves better results for the two smaller databases and also stays very close at the largest load level. MariaDB has a higher spread, but its quartile and median values are lower than TiDB's. As MariaDB achieves better performance in every test with the smallest database size, the hypothesis is accepted.

The second hypothesis is “TiDB should scale better than MariaDB as databases grow as each node only has to search a subset of the total database” and applies to the Sysbench test. The Sysbench read-only tests do show that TiDB scales better than MariaDB as the amount of data increases, but the same effect is not as visible in the read-write test. This hypothesis was therefore rejected, as TiDB did not scale better than MariaDB in some configurations.

The third hypothesis is “TiDB should scale better than MariaDB as database load increases since the database requests can be spread out over several nodes” and applies to Sysbench and the website load testing. The results from the website load testing do not show a clear winner in terms of scaling; both databases get similar relative performance for every configuration. In the Sysbench tests, however, TiDB seemed to scale better than MariaDB in many of the tests. The difference was most visible at 100 000 000 rows for both the read-write and read-only tests, where the performance of MariaDB decreased significantly while TiDB was not affected to the same degree. However, the scaling difference was not substantial enough across both Sysbench and the website load testing to conclude that TiDB scales better than MariaDB, so the hypothesis was rejected.

The reason the improved scaling in the Sysbench testing does not appear in the Siege tests may be that the Siege tests did not place as much load on the database as the Sysbench tests. The web server benchmarks only tested a maximum of 25 concurrent users compared to the 100 threads of the Sysbench tests. The Sysbench tests also ran queries against each database non-stop, whereas the Siege queries were sent by a web server in the process of constructing the requested web page. A web server running Wordpress will not send all queries back-to-back but will instead execute some amount of PHP code between database queries, which gives the database more time to handle other users.

The fourth hypothesis is “TiDB should perform better for complex queries compared to MariaDB since several nodes can cooperate to complete queries” and applies to the database queries. The results show that TiDB achieves much better performance than MariaDB for two of the three queries. The first query was expected to run quickly and repeated the Sysbench result of MariaDB being faster than TiDB for small queries. The second and third queries both required iterating through whole tables, and the results show that TiDB performed significantly better than MariaDB. In these cases TiDB should be able to utilize its sharded architecture to divide the workload across several nodes, which would explain the results. This hypothesis is accepted, as TiDB was expected to perform worse for the first query.

RQ1: “How does sharded TiDB compare to MariaDB in terms of performance for queries directly against the database?” The results from the Sysbench load testing indicate that TiDB can achieve better performance than MariaDB when the load on the database is very high or when the database size is very large, although this effect is not as strong in the read-write tests. TiDB always performed worse on simple queries or when the databases involved were small. In certain cases TiDB seemed to give more consistent results with reduced variation between queries; however, this effect was only seen in certain tests and configurations. TiDB performs best with large queries and tables where individual queries may take seconds. In these scenarios, TiDB outperformed MariaDB by a factor of 2 or more in query response time for the queries tested.

RQ2: “How does sharded TiDB compare to MariaDB in terms of performance for Wordpress queries?” In a web server environment TiDB consistently performed worse than MariaDB, regardless of load. This indicates that Wordpress runs better with a database that can execute small queries quickly than with a database that carries the overhead of TiDB. Even with many concurrent users, TiDB did not manage to beat the performance of MariaDB. From the Sysbench tests it seems that TiDB's architecture makes the most difference when a large dataset is combined with many concurrent users. The web server tests executed in this study show that, with the website and load tested, TiDB's architecture introduces too much overhead to outperform MariaDB in a Wordpress environment.

6 Discussion

6.1 General discussion

This study compared a NewSQL database to a relational SQL database to find how they compare in performance and scaling, both in general and when used with a web server.

Objectives for answering the aim were the following:

1. Gather knowledge about the domain
2. Gather data about pure databases
3. Gather data about Wordpress stack
4. Analyze and describe results

The first objective involved finding research related to the problem area, getting familiar with and setting up an environment for the experiment and establishing the databases in the environment. The second and third objectives involved finding benchmarking approaches for both databases and Wordpress and then executing the benchmark using the approaches. The fourth objective involved analyzing the results from the benchmark and arriving at a conclusion to the research questions and how they relate to the hypotheses.

The aim of this study was to investigate whether a NewSQL database like TiDB can achieve better performance than an SQL database like MariaDB, both in general and when used with Wordpress. The results from the study indicate that both databases have strengths and weaknesses. MariaDB performed best for small individual queries and small databases, in addition to giving more consistent results for read and write workloads. TiDB performed best for complex queries, large databases and high load, but is not as consistent for read and write workloads and did not perform as well for Wordpress. Therefore, it is possible for a NewSQL database like TiDB to achieve better performance than MariaDB, but only for these specific scenarios and workloads.

There are many studies that try to evaluate the performance of various databases; however, most either compare databases of the same type or compare SQL databases to NoSQL databases. Kaur & Sachdeva (2017) compared various NewSQL databases and found that NuoDB and MemSQL performed the best of the four NewSQL databases tested. Many of the studies that compare SQL to NoSQL find that NoSQL performs best in general; for example, Győrödi et al. (2015) compared MySQL to the NoSQL database MongoDB in the context of application development and found that MongoDB had the best performance. Győrödi et al.'s (2015) and Kaur & Sachdeva's (2017) results differ from the results in this study in that there was a clear winner in terms of both performance and ease of use for application development. Fatima & Wasnik (2016), however, compared the performance of the NewSQL database VoltDB against MongoDB and MySQL. Their results are similar to this study in that VoltDB seems to scale better with a larger number of clients. VoltDB, however, also seems to perform equally well in write workloads as in read workloads, which was not the case in this study. The NewSQL database looked at in this study only performed better in certain situations, which makes it more difficult to make a general recommendation for which database to use.

6.1.1 Documentation and community

This study focused on the performance differences between the databases, but there are many other points to consider when deciding between databases, such as documentation and community activity. Both TiDB and MariaDB provide documentation that covers what is necessary to set up and use each database in various environments. However, the databases were found to differ in community support. When comparing the number of results in searches for both databases on Stack Overflow, MariaDB returned 500 results while TiDB only returned 152 results (Stack Exchange Inc). MariaDB's count may be misleadingly low, however, as many MariaDB users are likely to tag their Stack Overflow questions with MySQL, since it is very similar to MariaDB but has a longer history and more users (Solid IT 2020). Searching Stack Overflow for MySQL returned over 500 000 results. This indicates that it might be more difficult to find outside help for TiDB if the documentation is not enough.

6.1.2 TiDB write performance

When evaluating the performance differences between the databases in Sysbench, it became clear that TiDB did not perform as well as MariaDB for write queries. In the read-only test TiDB managed to beat MariaDB once the table size reached 100 000 000 rows, but in the read-write test TiDB did not perform as well. This is believed to be because of the replication built into TiDB that MariaDB lacks. In order to ensure that data is not lost in the case of node failure, TiDB needs to send new data to other nodes before it can complete a write query. Since MariaDB only runs on a single node it avoids this delay. A fairer comparison in this case would be against a database that uses, for example, a master-slave replication scheme where data is replicated across nodes.

6.2 Research usefulness

This study is targeted towards companies and others that currently use SQL databases like MariaDB or MySQL, for Wordpress and, to a smaller extent, for other use cases. The conclusions hopefully show in which workloads a NewSQL database can be beneficial compared to an SQL database. The number of users simulated in this experiment might not transfer directly to real workloads with the same number of users, but the results can give an indication of the strengths and weaknesses of the databases and whether or not they are worth considering for a general type of workload.

6.3 Ethics and Validity threats

This chapter explores some research ethics and validity threats that affect the study.

6.3.1 Ethics

This is a database study that focuses on the performance of the tested databases and should therefore not have any direct impact on other people. Databases have very broad applications and are used everywhere from smartphones to the weapons industry. The results of this study can likewise be used for many different purposes; however, the study was performed solely in the interest of comparing different approaches to database design in the context of web servers.

The data used in the Wordpress and database query tests comes from a database that has been in use by an actual company. The dataset can potentially contain information that is sensitive to either individuals or the company. Therefore only data on response times and query performance will be published; no data from the content of the database will be disclosed. This may affect the reproducibility of the study as the exact dataset will not be included in the report.

TiDB is developed by a Chinese company, in contrast to MariaDB, which has roots in an American company. This is not a problem in itself, but there are worries that Chinese tech companies are used by the Chinese government to collect data on people. This could affect how companies view this new database and how alternative databases are perceived.

The source code for both of the databases, all benchmarking tools and the other software used in this benchmark is open source and available on either GitHub or the project's own website.

Material for the report is either referenced or written by the authors. In addition to the benchmarking tools and software referenced, a number of scripts were written to help execute the experiment.

6.3.2 Validity threats

Wohlin (2012) describes a number of validity threats that can be applicable to an experiment. A validity threat applicable to this experiment is reliability of measures. The benchmarking measurements are affected to some degree by random chance, both due to the complexity of the databases and due to being run on a server cluster alongside other software. Random factors were handled by focusing on average values after collecting measurements multiple times. Another applicable validity threat is random irrelevancies in the experimental setting. It is possible that unrelated software was running on the server during benchmarks, affecting the results. This could have unfairly reduced the performance of the database being benchmarked. A countermeasure has been to execute the same benchmark multiple times and to communicate with the company about when the server cluster needed to be available.

Another validity threat is ambiguity about the direction of causal influence. Unknown factors can affect database performance in a way that influences the benchmarking results. Sysbench's random data creation may have affected performance since it created different randomized tables for each database. The TiDB and MariaDB web stacks could also contain differences other than the databases that affected the performance. This was managed by repeating the benchmarks to see if the results were consistent, and by monitoring performance during the benchmarks to see if performance ever inexplicably changed.

Another validity threat is restricted generalizability across constructs. During this study, both databases were allowed to use as much CPU as they needed. This could mean that one of the databases achieved better performance simply because it could make better use of the extra CPU power it was given. It may also have been the case that, while they achieved similar performance, one of the databases had a much higher CPU usage than the other. Since Kubernetes does not allow for limiting total CPU usage across several pods, both databases were simply allowed as much CPU time as they needed.

The TiDB documentation (2020) states that the same region of data should typically be distributed over three nodes. However, it is not explicitly described how distribution works for exactly three nodes, which is the lowest recommended number of nodes for a cluster and the number used in this study. One of the main features of sharding is that data is replicated over some nodes but not all. According to the documentation, data would in this case be replicated over all three nodes, which could have an impact on TiDB's performance.

The database query part of the benchmark consisted of three queries of varying complexity. These queries were able to give an idea of how the databases perform as query complexity increases. However, the queries only tested complexity to a certain degree, which might be insufficient for drawing conclusions that apply to all types of complex queries.

Tongkaw & Tongkaw (2016) performed load testing with Sysbench using 100, 500 and 1000 threads. This study instead used 1, 10 and 100 threads for Sysbench and 1, 5 and 25 for Siege. The numbers used by Tongkaw & Tongkaw (2016) were considered too high for this use case, as queries already began to time out at the maximum values used here. The number of threads in both Sysbench and Siege was not based on any research, which could have had an effect on the conclusions as different load levels might have changed the results. Another side effect could be that the results are not generalizable to the same extent, as the testing values were not reflective of a real workload.

6.4 Future work

This work has focused on how databases perform when used for a website and similar workloads. There are multiple types of workloads relevant for databases, such as OLAP, characterized by fewer but more complex queries, and OLTP, with many simpler queries. The results indicate that TiDB performs much better in large scale OLAP workloads that can take advantage of the benefits of sharding. Further studies could compare TiDB's performance to other databases for OLAP workloads.

The benchmark focuses on the two databases TiDB and MariaDB in order to compare the performance of NewSQL and SQL. There are, however, many other databases that could be used for the same comparison. TiDB utilizes sharding and was compared to MariaDB because it provides replication and potentially better performance over MariaDB. Future studies could include other NewSQL databases with different sharding implementations or other features, as many NewSQL databases are untested. Different SQL databases could also be explored, as they have replication schemes available, such as MariaDB cluster, which utilizes a master-slave replication scheme.

This comparison tested TiDB with three nodes, between which data could be distributed and replicated. However, TiDB has the potential for better performance as it supports more than the three nodes that were tested. Increasing the number of nodes for TiDB could influence its performance: if data is divided over more nodes, each individual node needs to store less data, which should let it respond to queries faster.

The tools used for load testing the databases, Sysbench and Siege, are only some of the available tools that could have been used for this benchmark. Other tools focus on different aspects of the databases and simulate other workloads. Tongkaw & Tongkaw (2016), for example, utilized OLTP-bench in addition to Sysbench, but both simulate OLTP workloads. Other benchmarking tools could load the databases differently, using other types of queries or datasets, and could therefore produce different results.

The results from the Sysbench and Siege load testing start to show better scaling for TiDB on larger databases and at greater load levels. However, with the exception of the Sysbench read-only tests, the load and table size are often not large enough for TiDB to actually pass MariaDB in average response time. A larger study with higher load and larger datasets would better establish for which use cases TiDB is best suited and at which scales it is not worth using.

The complex database queries used in this benchmark were able to give an idea of how the databases perform for larger queries. However, this study only looked at two slow queries. Testing the databases with additional queries covering a wider range of complexity could give an even better idea of how each database handles complexity.
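Such a follow-up could also inspect how each database plans a heavy query before timing it. Both MariaDB and TiDB accept `EXPLAIN`, though their output formats differ. The query below runs against the standard Wordpress schema but is illustrative only, not one of the two slow queries used in this study:

```shell
# Sketch: compare query plans for a heavier join/aggregate query
# (illustrative query; connection details are placeholders).
mysql -h 127.0.0.1 -u bench -p'secret' wordpress -e "
  EXPLAIN
  SELECT p.post_title, COUNT(c.comment_ID) AS comments
  FROM wp_posts p
  JOIN wp_comments c ON c.comment_post_ID = p.ID
  GROUP BY p.ID
  ORDER BY comments DESC
  LIMIT 20;"
```

Comparing the plans side by side would show whether a slow query is slow because of the optimizer's choices or because of the storage layer.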

Kubernetes was used for setting up the databases, and because both databases run in the same environment, their performance should be affected equally. However, TiDB's documentation specifies many possible environments for setting up the database, and Kubernetes is just one of them. Other environments could affect the databases in other ways by having a different impact on performance.

Kubernetes does not allow limiting the CPU in a way that would be fair to both databases, so it would be interesting to explore how the databases perform under the same CPU restrictions. Other environments, or different setups that allow this type of resource allocation, could give results that better reflect performance under equal circumstances for both databases.
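Kubernetes does support per-container CPU and memory limits; the difficulty noted above is deciding how a single-process MariaDB and a multi-pod TiDB cluster should share one budget. A sketch of applying identical caps, with placeholder workload names and namespaces, might look as follows:

```shell
# Sketch: apply the same per-container CPU/memory caps to both databases
# (workload names and namespace are placeholder assumptions).
kubectl -n db set resources statefulset/mariadb \
  --limits=cpu=2,memory=4Gi --requests=cpu=2,memory=4Gi

# For TiDB the budget has to be divided across its component pods,
# e.g. capping each TiKV container:
kubectl -n db set resources statefulset/basic-tikv \
  --limits=cpu=2,memory=4Gi --requests=cpu=2,memory=4Gi
```

Whether per-container caps like these add up to an equal total budget for a three-node cluster versus a single instance is exactly the fairness question a future study would need to settle.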

7 References

Drake, M. (2019). Understanding Database Sharding. https://www.digitalocean.com/community/tutorials/understanding-database-sharding [2020-04-24]

Elmasri, R. & Navathe, S. B. (2011). Fundamentals of Database Systems. 6th edition. Addison-Wesley.

Facebook Database Engineering Team (2020). RocksDB Documentation. https://rocksdb.org/docs/ [2020-05-01]

Fatima, H. & Wasnik, K. (2016). Comparison of SQL, NoSQL and NewSQL Databases for Internet of Things. 2016 IEEE Bombay Section Symposium (IBSS). Baramati, India 21-22 Dec. 2016.

Fulmer, J. (2020). Siege. https://github.com/JoeDog/siege/ [2020-03-18]

Grolinger, K., Higashino, W. A., Tiwari, A. & Capretz, M. A. (2013). Data management in cloud environments: NoSQL and NewSQL data stores. Journal of Cloud Computing 2, 22. https://doi.org/10.1186/2192-113X-2-22

Gupta, A., Tyagi, S., Panwar, N., Sachdeva, S. & Saxena, U. (2017). NoSQL databases: Critical analysis and comparison. 2017 International Conference on Computing and Communication Technologies for Smart Nation (IC3TSN). Gurgaon, India 12-14 Oct. 2017, pp. 293-299.

Győrödi, C., Győrödi, R., Pecherle, G. & Olah, A. (2015). A comparative study: MongoDB vs. MySQL. 2015 13th International Conference on Engineering of Modern Electric Systems (EMES). Oradea, Romania 11-12 June 2015, pp. 1-6.

Helm Authors (2020). Helm documentation. https://helm.sh/docs/ [2020-02-23]

Kaur, K. & Sachdeva, M. (2017). Performance evaluation of NewSQL databases. 2017 International Conference on Inventive Systems and Control (ICISC). Coimbatore, India 19-20 Jan. 2017, pp. 1-5.

Kopytov, A. (2020). Sysbench. https://github.com/akopytov/sysbench [2020-03-17]

Li, Y. & Manoharan, S. (2013). A performance comparison of SQL and NoSQL databases. 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM). Victoria, BC, Canada 27-29 Aug. 2013, pp. 15-19.

Meier, A. & Kaufmann, M. (2019). SQL & NoSQL Databases. Wiesbaden: Springer Vieweg. https://doi.org/10.1007/978-3-658-24549-8

Michael, M., Moreira, J. E., Shiloach, D. & Wisniewski, R. W. (2007). Scale-up x Scale-out: A Case Study using Nutch/Lucene. 2007 IEEE International Parallel and Distributed Processing Symposium. Rome, Italy 2007, pp. 1-8.

Microsoft (2020). Horizontal, vertical, and functional data partitioning. https://docs.microsoft.com/en-us/azure/architecture/best-practices/data-partitioning [2020-04-29]

Oracle Corporation (2020). Introduction to InnoDB. https://dev.mysql.com/doc/refman/5.6/en/innodb-introduction.html [2020-04-27]

Patil, M. M., Hanni, A., Tejeshwar, C. H. & Patil, P. (2017). A qualitative analysis of the performance of MongoDB vs MySQL database based on insertion and retrieval operations using a web/android application to explore load balancing — Sharding in MongoDB and its advantages. 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC). Palladam, India 10-11 Feb. 2017, pp. 325-330.

Pavlo, A. & Aslett, M. (2016). What's Really New with NewSQL? ACM SIGMOD Record, 45(2), pp. 45-55.

PingCAP (2020). TiDB Introduction. https://pingcap.com/docs/ [2020-02-20]

PingCAP (2020). TiDB Architecture. https://pingcap.com/docs/stable/architecture/ [2020-03-20]

Red Hat (2020). What's a Linux container? https://www.redhat.com/en/topics/containers/whats-a-linux-container [2020-04-29]

Solid IT (2020). DB-Engines Ranking. https://db-engines.com/en/ranking [2020-04-29]

Stack Exchange Inc (2020). Stack Overflow. https://stackoverflow.com/ [2020-04-16]

Storti, B. (2017). A Primer on Database Replication. https://www.brianstorti.com/replication/ [2020-04-28]

The Kubernetes Authors (2020). Kubernetes documentation. https://kubernetes.io/docs/ [2020-02-22]

The PostgreSQL Global Development Group (2020). PostgreSQL: Documentation. https://www.postgresql.org/docs/8.3/tutorial-transactions.html [2020-04-28]

TiKV Project Authors (2020). TiKV Documentation. https://tikv.org/docs [2020-02-20]

Tongkaw, S. & Tongkaw, A. (2016). A comparison of database performance of MariaDB and MySQL with OLTP workload. 2016 IEEE Conference on Open Systems (ICOS). Langkawi, Malaysia 10-12 Oct. 2016, pp. 117-119.

W3Techs (2020). Usage statistics of content management systems. https://w3techs.com/technologies/overview/content_management [2020-03-15]

Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B. & Wesslén, A. (2012). Experimentation in Software Engineering. Springer Science & Business Media.
