Mosql: an Elastic Storage Engine for Mysql

MoSQL: An Elastic Storage Engine for MySQL Alexander Tomic Daniele Sciascia Fernando Pedone University of Lugano University of Lugano University of Lugano Switzerland Switzerland Switzerland ABSTRACT RDBMSs were prohibitive or unnecessary, MySQL has also We present MoSQL, a MySQL storage engine using a trans- been deployed in large and complex environments (e.g., actional distributed key-value store system for atomicity, iso- Wikipedia, Google, Facebook, Twitter). Yet, despite its lation and durability and a B+Tree for indexing purposes. popularity, MySQL is essentially a standalone database server. Despite its popularity, MySQL is still without a general- Multi-server deployments are possible but provide weaker purpose storage engine providing high availability, serializ- system guarantees than single-server configurations (e.g., ability, and elasticity. In addition to detailing MoSQL's de- weak isolation levels, absence of distributed transactions). sign and implementation, we assess its performance with a This is clearly detrimental to the many applications based number of benchmarks which show that MoSQL scales to a on MySQL that have evolved into large and mature systems fairly large number of nodes on-the-fly, that is, additional with originally unexpected scalability, fault tolerance, and nodes can be added to a running instance of the system. performance requirements. The continuing migration of services and applications to the web has exposed RDBMSs to workloads that are larger, Categories and Subject Descriptors growing faster, and behaving more unpredictably. This trend H.2.4 [Information Systems]: Database Management| typically results in over-provisioning in the difficult-to-scale Systems database tier to ensure responsiveness for all expected ranges of client load, while costly and disruptive system upgrades Keywords to meet growing scalability requirements remain a reality. Elastic storage holds the promise of saving over-provisioning Elasticity, Scalability, High Availability, RDBMS, MySQL costs while maintaining high performance and low latency, especially for highly variable or cyclical workloads [2]. 1. INTRODUCTION MoSQL, the distributed storage engine we have designed Relational database management systems (RDBMS) have and implemented, provides near-linear scalability with addi- had remarkable staying power in real-world applications de- tional nodes and strongly consistent, serializable transaction spite many alternative approaches showing promise (e.g., isolation. MoSQL stores data across several storage nodes, XML-based storage, object databases). In recent years, with each storage node containing a subset of the dataset however, the realities of scaling database systems to Internet in memory only. Although each node is responsible for a proportions have made customized solutions more practi- portion of the database, it provides upper layers the ab- cal than general-purpose one-size-fits-all RDBMSs [31]. De- straction of a single-partition system: database entries not spite the disadvantages of using a general-purpose RDBMS stored locally on a storage node are fetched from the re- in comparison to more specific solutions, we expect that a mote nodes responsible for storing them; for performance, significant number of legacy applications will remain in the remotely fetched entries are locally cached. This mecha- years ahead and thus the need to improve the scalability, nism is far more efficient than the one used by standalone performance, and fault-tolerance of RDBMSs is acute. databases, which fetch missing items in the cache from an MySQL is an open-source RDBMS at the core of many on-disk copy of the database. multi-tier applications based on the \LAMP software stack" MoSQL's storage nodes offer simplified concurrency con- (i.e., Linux, Apache, MySQL and PHP). Although the LAMP trol and single-threaded execution; parallelism can be ex- stack initially thrived in environments where the cost, com- ploited by deploying multiple node instances on a single plexity, and capabilities of enterprise-grade frameworks and physical server. Update transactions proceed optimistically: there is no global synchronization of update transactions across nodes during execution (i.e., no distributed locks and deadlocks [14]). At commit time, update transactions are Permission to make digital or hard copies of all or part of this work for certified; the certifier decides which transactions must be personal or classroom use is granted without fee provided that copies are aborted in order to keep the database in a consistent state. not made or distributed for profit or commercial advantage and that copies Read-only transactions always see a consistent snapshot of bear this notice and the full citation on the first page. To copy otherwise, to the database and need not be certified. This mechanism en- republish, to post on servers or to redistribute to lists, requires prior specific ables high performance in large databases where contention permission and/or a fee. SAC’13 March 18–22, 2013, Coimbra, Portugal for the same data is infrequent. Copyright 2013 ACM 978-1-4503-1656-9/13/03 ...$10.00. We have implemented all the features described in the Client MySQL Server Connection Pool paper and conducted a performance evaluation of MoSQL Storage Engine using the TPC-C benchmark. We show that MoSQL is capa- btree_search( ) result ble of scaling TPC-C throughput sublinearly to 16 physical btree_insert( ) servers. With two physical servers, MoSQL is able to sur- MySQL Server Storage node pass the throughput of a single-server instance of MySQL using the standard transactional storage engine InnoDB, while Storage Engine B+Tree layer still maintaining elastic capability and fault tolerance. We get() / put() / commit() commit / abort also demonstrate the elastic capabilities of MoSQL: we add Storage node Transaction clients to a running system until a given latency threshold layer transaction record is passed and then add nodes to the storage tier and launch B+Tree layer read( ) result a new MySQL node and redistribute the clients. Transaction layer The rest of this paper is structured as follows. Section 2 read( ) Storage layer Storage layer details MoSQL's design. Sections 3 and 4 discuss isolation result and performance considerations. Section 5 presents an ex- write( ) perimental evaluation of MoSQL's prototype. Section 6 re- views related work and Section 7 concludes the paper. readsets Certifier writesets (Replicated 2. SYSTEM DESIGN State Machine) Log 2.1 Model and definitions Figure 1: MoSQL global architecture. We consider an environment composed of a set C = fc1; c2; :::g of client nodes and a set S = fs1; :::; sng of requests are either served from local memory or from database server nodes. Nodes communicate through mes- the memory of a remote storage node. This is effec- sage passing and do not have access to a shared memory. tive since retrieving rows from a remote storage node We assume the crash-stop failure model (e.g., no Byzantine is usually faster than retrieving rows from local disk. failures). A node, either client or server, that does not crash is correct, otherwise it is faulty. • The Certifier is a replicated state machine that logs The environment is asynchronous: there is no bound on transactions on disk, ensures serializable transaction message delays and on relative processing speeds. How- executions, and synchronizes system events (i.e., recov- ever, we assume that some system components can be made ery, storage node additions and removals). At storage fault tolerant using state-machine replication, which requires nodes, transactions proceed without synchronization. commands to be executed by every replica (agreement) in At commit time, storage nodes submit transactions to the same order (total order) [17, 25]. Since ordered deliv- the certifier, which ensures serializable execution. ery of commands cannot be implemented in a purely asynchronous environment [6, 12], we assume it is ensured by an Any number of MySQL instances can be connected to \ordering oracle" [17]. any number of storage nodes. Depending on the workload, The isolation property is serializability: every concurrent it may be advantageous to assign multiple storage nodes per execution of committed transactions is equivalent to a se- MySQL instance, or vice-versa. We typically deploy one rial execution involving the same transactions [4]. Serializ- MySQL process together with a small number of storage ability prevents anomalous behaviors, namely, dirty reads, nodes per physical machine, depending on how many cores unrepeatable reads, lost updates and phantoms [15]. are available in the machines. In the following sections, we detail each one of MoSQL 2.2 Overview components. The architecture of MoSQL decouples some of the com- 2.3 Storage nodes ponents typically bundled together in monolithic databases. In particular, concurrency control and logging management Storage nodes are divided into three distinct layers, where are separate from data storage and access methods. In this each layer builds upon the abstraction offered by the layer sense our approach is similar to the model proposed in [19] below. The bottom layer implements a distributed stor- and expands upon the work in [33] and [26]. Figure 1 shows age abstraction. Each storage node is assigned a subset of the architecture of MoSQL. entries, and the storage layer provides operations to read In brief, MoSQL is composed of three main components: and write such entries. This layer abstracts away

Mosql: an Elastic Storage Engine for Mysql

Index Selection for In-Memory Databases

Amazon Aurora Mysql Database Administrator's Handbook

Innodb 1.1 for Mysql 5.5 User's Guide Innodb 1.1 for Mysql 5.5 User's Guide

Mysql Table Doesn T Exist in Engine

Migrating Borland® Database Engine Applications to Dbexpress

A Methodology for Evaluating Relational and Nosql Databases for Small-Scale Storage and Retrieval

Digital Forensic Innodb Database Engine for Employee Performance Appraisal Application

Introduction to Databases Presented by Yun Shen ([email protected]) Research Computing

The Design and Implementation of a Non-Volatile Memory Database Management System Joy Arulraj

Introducing Severalnines Clustercontrol™

Mysql Storage Engines Data in Mysql Is Stored in Files (Or Memory) Using a Variety of Different Techniques

Database Buffer Cache • Relation to Overall System • Storage Techniques in Context