AN EVALUATION OF KEY-VALUE STORES IN

SCIENTIFIC APPLICATIONS

A Thesis Presented to

the Faculty of the Department of Computer Science

University of Houston

In Partial Fulfillment

of the Requirements for the Degree

Master of Science

By

Sonia Shirwadkar

May 2017

AN EVALUATION OF KEY-VALUE STORES IN

SCIENTIFIC APPLICATIONS

Sonia Shirwadkar

APPROVED:

Dr. Edgar Gabriel, Chairman Dept. of Computer Science, University of Houston

Dr. Weidong Shi Dept. of Computer Science, University of Houston

Dr. Dan Price Honors College, University of Houston

Dean, College of Natural Sciences and Mathematics

Acknowledgments

“No one who achieves success does so without the help of others. The wise acknowledge

this help with gratitude.” - Alfred North Whitehead

Although I have a long way to go before I am wise, I would like to take this opportunity to express my deepest gratitude to all the people who have helped me in this journey.

First and foremost, I would like to thank Dr. Gabriel for being a great advisor. I appreciate the time, effort and ideas that you have invested to make my graduate experience productive and stimulating. The joy and enthusiasm you have for research was contagious and motivational for me, even during tough times. You have been an inspiring teacher and mentor and I would like to thank you for the patience, kindness and humor that you have shown. Thank you for guiding me at every step and for the incredible understanding you showed when I came to you with my questions. It has indeed been a privilege working with you.

I would like to thank Dr. Shi and Dr. Price for accepting to be my committee members.

I truly appreciate the time and effort you spent in reviewing my thesis and providing valuable feedback.

A special thanks to my PSTL lab-mates Shweta, Youcef, Tanvir, and Raafat. You have contributed immensely to my personal and professional time at the University of Houston.

The last nine months have been a joy mainly because of the incredible work environment in the lab. Thank you for being great friends and for all the encouragement that you have given me.

A big thanks to Hope Queener and Jason Marsack at the College of Optometry for teaching me the value of team-work and work ethics. I truly enjoyed working with you.

I have been extremely fortunate to have the constant support, guidance, and faith of

my friends. A big thank you to all my friends in India, for constantly motivating me to follow my dreams. Thank you for the late-night calls, care packages, and all the love that you have given me in the time that I have been away from home. I would like to thank my friends Omkar, Tejus, Sneha, Sonal, Aditya, and Shweta for being my family away from home. I will forever be grateful for the constant assurance and encouragement that you gave me. I would also like to thank my friends, classmates, and roommates here in Houston for all their help and support.

A special thanks to all my teachers. I would not be here if not for the wisdom that you have shared. You have empowered me to chase my dreams. Each one of you has taught me important life lessons that have always guided me. I will be eternally grateful to have been your student.

Last but by no means the least, I would like to thank my family for always being there for me. I would like to start by thanking my Mom and Dad for their unconditional love and support. A very big thank you to Kaka and Kaku for all their love, concern and advice.

You all have taught me the beauty of hard-work and perseverance and this thesis would never have been possible without you. Finally, I would like to thank Parikshit for being my greatest source of motivation. You inspire me everyday to be a better version of myself and I would never have made it without you.

AN EVALUATION OF KEY-VALUE STORES IN SCIENTIFIC APPLICATIONS

An Abstract of a Thesis Presented to the Faculty of the Department of Computer Science University of Houston

In Partial Fulfillment of the Requirements for the Degree Master of Science

By Sonia Shirwadkar May 2017

Abstract

Big data analytics is a rapidly evolving multidisciplinary field that involves the use of computing capacity, tools, techniques, and theories to solve scientific and engineering problems. With the big data boom, scientific applications now have to analyze huge volumes of data. NoSQL databases [1] are gaining popularity for these types of applications due to their scalability and flexibility. There are various types of NoSQL databases available in the market today [2], including key-value databases. Key-value databases [3] are the simplest NoSQL databases, where every single item is stored as a key-value pair. In-memory key-value stores are specialized key-value databases that maintain data in main memory instead of on disk. Hence, they are well suited for applications with high frequencies of alternating read and write cycles. The focus of this thesis is to analyze popular in-memory key-value stores and compare their performance. We have performed the comparisons based on parameters like in-memory caching support, supported programming languages, scalability, and utilization from parallel applications. Based on the initial comparisons, we evaluated two key-value stores in detail, namely Memcached [4] and Redis [5]. To perform an extensive analysis of these two data stores, a set of micro-benchmarks has been developed and evaluated for both Memcached and Redis. Tests were performed to evaluate the scalability, responsiveness, and data-load handling capacity, and Redis outperformed Memcached in all test cases.

To further analyze the in-memory caching ability of Redis, we integrated it as a caching layer into an air-quality simulation [6] based on Hadoop [7] MapReduce [8], which calculates the eight-hour rolling average of ozone concentration at various sites in Houston, TX. Our aim was to compare the performance of the original air-quality application, which uses the disk for data storage, to our application, which uses in-memory caching. Initial results show that there is no performance gain achieved by integrating Redis as a caching layer. Further optimizations and configurations of the code are reserved for future work.

Contents

1 Introduction 1

1.1 Brief Overview of Key-Value Data Stores ...... 4

1.2 Goals of this Thesis ...... 6

1.3 Organization of this Document ...... 7

2 Background 8

2.1 In-memory Key-value Stores ...... 9

2.1.1 Redis ...... 9

2.1.2 Memcached ...... 12

2.1.3 Riak ...... 15

2.1.4 Hazelcast ...... 17

2.1.5 MICA (Memory-store with Intelligent Concurrent Access) ...... 21

2.1.5.1 Parallel Data Access ...... 21

2.1.5.2 Network Stack ...... 22

2.1.5.3 Key-value Data Structures ...... 23

2.1.6 Aerospike ...... 24

2.1.7 Comparison of Key-Value Stores ...... 26

2.2 Brief Overview of Message Passing Interface (MPI) ...... 29

2.3 Brief Overview of MapReduce Programming and Hadoop Eco-system . . . . 31

2.3.1 Integration of Key-Value Stores in Hadoop ...... 35

3 Analysis and Results 36

3.1 MPI Micro-benchmark ...... 37

3.1.1 Description of the Micro-benchmark Applications ...... 38

3.1.1.1 Technical Data ...... 41

3.1.2 Comparison of Memcached and Redis using our Micro-benchmark . 41

3.1.2.1 Varying the Number of Client Processes ...... 43

3.1.2.1.1 Using Values of Size 1 KB ...... 43

3.1.2.1.2 Using Values of Size 32 KB ...... 44

3.1.2.2 Varying the Number of Instances ...... 47

3.1.2.3 Varying the Size of the Value ...... 48

3.1.2.4 Observations and Final Conclusions ...... 50

3.2 Air-quality Simulation Application ...... 51

3.3 Integration of Redis in Hadoop ...... 53

3.3.1 Technical Data ...... 55

3.4 Results and Comparison ...... 56

4 Conclusions and Outlook 59

Bibliography 62

List of Figures

1.1 Key-value pairs ...... 5

2.1 Redis Cluster ...... 11

2.2 Redis in a Master-Slave Architecture ...... 12

2.3 Memcached Architecture ...... 14

2.4 Riak Ring Architecture ...... 17

2.5 Hazelcast In-memory Computing Architecture ...... 19

2.6 Hazelcast Architecture ...... 20

2.7 MICA Approach ...... 23

2.8 Aerospike Architecture ...... 25

2.9 Word Count Using Hadoop MapReduce ...... 34

3.1 Time Taken to Store and Retrieve Data When the Number of Client Pro-

cesses is Varied...... 44

3.2 Time Taken to Retrieve Data When the Number of Client Processes is Varied. 46

3.3 Time Taken to Store and Retrieve Data When the Number of Servers is

Varied...... 48

3.4 Time Taken to Store and Retrieve Data when the Value Size is Varied. . . . 50

3.5 Customized RecordWriter to Read in Data from Redis ...... 54

3.6 Customized RecordReader to Write Data to Redis ...... 55

3.7 Comparison of Execution Times (in minutes) for Air-quality Applications

Using HDFS and Redis...... 57

List of Tables

2.1 Summary of features of key-value stores ...... 28

3.1 Time taken to store and retrieve data when number of client processes is

varied...... 43

3.2 Time taken to store and retrieve data when number of client processes is

varied...... 45

3.3 Time taken to store and retrieve data when the number of servers is varied. 47

3.4 Time taken to store and retrieve data when the size of the value is varied. . 49

3.5 Time taken to execute original air-quality application ...... 56

3.6 Time taken to execute the air-quality application using Redis ...... 57

Chapter 1

Introduction

Traditionally, science has been divided into theoretical and applied/experimental branches.

Scientific computing (or computational science), though closely related to the theoretical side, also has features related to the experimental domain. Computational science has now become the third pillar of science, and scientists increasingly employ scientific computing tools and techniques to solve many problems in the fields of science and engineering.

Problems as diverse as designing the wing of an airplane and predicting the weather are being solved using scientific computing methodologies. However, the data generated in such problems is in the range of hundreds of gigabytes, while some applications even deal with terabytes of data. “Big Data” is a term that is generally used to describe such a collection of data, which is huge in size and yet growing exponentially with time. The New

York Stock Exchange generates terabytes of new trade data per day. Social media sites like

Facebook ingest and generate around 500+ terabytes of data per day. A single jet engine can generate 10+ terabytes of data in 30 minutes of flight time; with thousands of flights scheduled per day, the data generated reaches several petabytes [9]. Many of these

applications are either real-time or have requirements to provide results in a timely manner. Traditionally, such applications were executed using specialized hardware along with conventional data storage and retrieval methods. However, as the scale of data increased, the need for larger and scalable data storage methods increased, which is why large data centers began to be used. The massive datasets collected are so large and complex that none of the traditional data-management tools are able to store or process them efficiently, mainly because these tools do not scale with the data.

Data being produced can be structured, unstructured, or semi-structured. Relational databases are bound by their schema and hence limit the type of data that can be entered into the database. They cannot accommodate the volume, velocity, and variety of the data being produced. Also, the data being collected cannot simply be discarded, because larger datasets can be analyzed to generate more accurate correlations, which may lead to more concrete decision-making, resulting in greater operational efficiencies and profits. In the early 2000s, the volumes of data being handled by organizations like Google started outgrowing the capacities of legacy RDBMS software. The exponential growth of the web also contributed to this data explosion, and gradually businesses all around began facing the issue of managing increasingly large volumes of data. While Internet giants such as Amazon and Google may have been the first to truly struggle with the “big data problem”, enterprises across industries were struggling to manage massive quantities of data, or data entering systems at a high velocity, or both. It wasn't long before data scientists and engineers designed new systems to meet the increasing data-management demands. As a result, the term “NoSQL” was introduced to describe data-management systems that contain some RDBMS-like qualities but go beyond the limitations of traditional SQL-based databases.

A NoSQL database environment is a non-relational database system optimized for horizontal scaling onto a large, distributed network of nodes. It enables rapid, ad-hoc organization and analysis of massive amounts of diverse data types. NoSQL is a whole new way of thinking about databases. The easiest way to think of NoSQL is as a database which does not adhere to the traditional relational database management system (RDBMS) structure; it is sometimes also referred to as ‘not only SQL’. It is not built on tables and does not necessarily employ SQL to manipulate data. NoSQL databases also commonly do not provide full ACID (atomicity, consistency, isolation, durability) [10] guarantees. NoSQL also helps ensure availability of data even in the face of hardware failures. If one or more database servers, or nodes, go down, the other nodes in the system are able to continue with operations without data loss, thereby showing true fault tolerance. When deployed properly, NoSQL databases enable high performance while also guaranteeing availability.

This is immensely beneficial because system updates, modifications, and maintenance can be carried out without having to take the database offline. As NoSQL databases do not strictly adhere to the ACID properties, they provide real location independence. This means that read and write operations to a database can be performed regardless of where that I/O operation physically occurs, with the operation being propagated out from that location so that it is available to users and machines at other sites. Such functionality is very difficult to architect for relational databases. NoSQL databases guarantee eventual consistency of the data across all nodes.

A NoSQL data model can support use cases that do not fit well into an RDBMS. A NoSQL database is able to accept all types of data (structured, semi-structured, or unstructured) much more easily than a relational database, which relies on a predefined schema. NoSQL systems are designed so that they can be easily integrated into the new cloud-computing architectures that have emerged over the past decade to allow massive computations to be run

inexpensively and efficiently. Data organized in NoSQL systems can be analyzed to gain insights about previously unknown patterns and trends with minimal coding and without the need for data scientists and additional infrastructure. This makes operational big-data workloads much easier to manage, and cheaper and faster to implement. Each organization has different requirements for its NoSQL database, and as a result there are various

NoSQL data stores in the market from different vendors, including Amazon, Google, etc., to handle big data. However, NoSQL databases can be broadly categorized as follows:

• Key-value store: In a key-value store, the data consists of an indexed key and a value,

hence the name.

• Document database: Expands on the basic idea of key-value stores where “docu-

ments” contain more complex data and each document is assigned a unique key.

• Column store: Instead of storing data in rows as done by RDBMS, these databases

are designed for storing data tables as sections of columns of data, rather than as

rows of data.

: Based on graph theory, these databases are designed for data whose

relations are well-represented as a graph and have elements which are interconnected.

This thesis focuses on key-value stores and on in-memory key-value stores in particular.

1.1 Brief Overview of Key-Value Data Stores

A key-value store is a simple database that uses an associative array (map or dictionary) as the fundamental data structure, in which each key is associated with one value. In each key-value pair, the key can be in the form of a string such as a filename, URI, or hash, while the value can be any kind of data. The value is stored as a

BLOB (Binary Large OBject). The value essentially is binary data and can be anything ranging from numbers, strings, counters, JSON, XML, HTML, binaries, images, and short videos. As a result, key-value stores require minimal upfront database design and are faster to deploy. Also, since data is referenced by keys, there is no need to index the data to improve performance. However, since the type of the values is not known to the database, you cannot filter or control what is returned from a request based on the value. Key-value stores provide a way to store, retrieve, and update data using get, put, and delete commands. The simplicity of this model makes a key-value store fast, easy to use, scalable, portable, and flexible. Figure

1.1 shows a collection of keys and the values associated with them. These key-value pairs are then ultimately stored in a key-value database configured to store and retrieve data in an efficient manner.

Figure 1.1: Key-value pairs

As seen in Figure 1.1, the data is stored in the form of key-value pairs. The key needs to be unique throughout the dataset since it serves as the index for the value into the datastore.

Key-value databases are designed so as to enable efficient storage and retrieval of key-value pairs. Typically, key-value datastores are implemented using hash tables, since retrieval from a hash table can be done in O(1) time if the hash table is implemented properly. Key-value stores can use consistency models ranging from eventual consistency to serializability.

Some maintain the data in memory (known as in-memory key-value stores) while others

employ storage devices to maintain the data. There are many types of key-value stores available today, but this thesis focuses on in-memory key-value stores, and specifically on two in-memory key-value databases, namely Memcached and Redis.
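To make the get/put/delete model above concrete, the following minimal sketch implements a toy in-memory key-value store in Java, backed by a hash map. It is purely illustrative; the class, method, and key names are our own and do not correspond to any particular product, and real key-value stores add networking, eviction, replication, and persistence on top of this basic idea.

import java.util.HashMap;
import java.util.Map;

// Toy key-value store: keys are strings, values are opaque byte arrays (BLOBs).
public class SimpleKeyValueStore {

    private final Map<String, byte[]> table = new HashMap<>();

    // put: associate a value with a key, overwriting any previous value.
    public void put(String key, byte[] value) {
        table.put(key, value);
    }

    // get: return the value stored under a key, or null if the key is absent.
    public byte[] get(String key) {
        return table.get(key);
    }

    // delete: remove the key-value pair, if present.
    public void delete(String key) {
        table.remove(key);
    }

    public static void main(String[] args) {
        SimpleKeyValueStore store = new SimpleKeyValueStore();
        store.put("user:1001:name", "Alice".getBytes());   // hypothetical key naming scheme
        System.out.println(new String(store.get("user:1001:name")));
        store.delete("user:1001:name");
    }
}

Because the store only ever looks keys up by exact match, a hash table provides the O(1) access mentioned above; this is also why the values can remain opaque to the store.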

1.2 Goals of this Thesis

Over the years, traditional databases have been the go-to solution for all data storage and analysis requirements. Although traditional databases have been the tried and tested way to store data, in recent years, we have seen a tremendous shift in the status quo and NoSQL databases have emerged as the solution for all “big data” applications. This is because traditional databases are unable to keep up with the “volume, velocity and variety” of the data currently being generated. There are many types of NoSQL databases each designed with a specific purpose and target group in mind. Key-value stores are one such type of

NoSQL databases which have found wide-spread use due to their simplicity and their ability to be easily integrated into any environment with minimal effort. In-memory key-value stores are a special kind of key-value store that retain data in RAM. They are now increasingly being used in enterprise applications to improve application performance by enhancing the speed with which data is written/read. The goal of this thesis is to evaluate and compare the various in-memory key-value stores currently available. This evaluation is done in three phases. In the first phase, we evaluate and compare popular and widely-used in-memory key-value stores available in the market today. In the second phase, we evaluate and compare in detail the performance of two in-memory key-value stores, namely Memcached and Redis, using a micro-benchmark that we have developed using the OpenMPI library. In the final phase, we integrate Redis into an application performing large-scale data analysis so as to analyze whether in-memory key-value stores enhance the performance of the application. The application that we have used is an air-quality simulation developed

using Hadoop MapReduce, which analyzes an air-quality dataset of 48.5 GB, containing measurements of pollutants from various sensors spread all over Texas. This application analyzes the given dataset to calculate the eight-hour rolling average of air-quality in sites across Houston, TX.

1.3 Organization of this Document

The rest of the thesis is organized as follows. Chapter 2 discusses the details of various widely-used in-memory key-value stores. It also outlines the details of the OpenMPI library and the Hadoop framework. In Chapter 3, we describe in detail the OpenMPI micro-benchmark with which we evaluate and compare the performance of Memcached and Redis. We also discuss the details of the Hadoop MapReduce air-quality simulation application and evaluate the performance results after integrating Redis into this application. In Chapter

4, we present the conclusion of the work.

Chapter 2

Background

In the previous chapter, we discussed briefly the limitations of traditional relational databases and how they fall short when dealing with huge volumes of data. Relational databases offer many powerful data management tools and techniques. However, a majority of applications today only require basic functionality to store and retrieve data by primary key and do not require the complex querying and management features offered by RDBMSs. Enterprise-level relational databases require sophisticated hardware and trained professionals for day-to-day operations, which increases the cost of maintaining applications using these databases. Also, the available replication strategies are limited and typically choose consistency over availability. Despite improvements being made, it is still difficult to scale out such databases or use smart partitioning schemes for load balancing. To overcome the limitations discussed earlier, NoSQL databases were proposed as the solution. Various types of

NoSQL databases are now increasingly being used for large-scale data analytics applications. Key-value stores are one such type of NoSQL databases which are widely used in production environments for their performance and simplicity. In-memory key-value stores are a specialized form of key-value stores, and they will be the main focus of this thesis. In

this chapter, we describe in detail some widely used in-memory key-value stores available in the market today. We will also briefly explain the OpenMPI library and the Hadoop framework, with which we have developed benchmarks and applications for our evaluation.

2.1 In-memory Key-value Stores

Key-value stores are the simplest form of NoSQL databases. A key-value store allows you to store data, indexed by unique keys. The value is just a blob, and the database is usually not concerned about the content or type of the value. In other words, key-value stores don't have a type-defined schema, but a client-defined semantics for understanding what the values are. Key-value stores tend to have great performance, because the access pattern in key-value stores can be optimized. The benefit of this approach is that it is very simple to build a key-value store. Also, applications using key-value stores are easily scalable. In-memory key-value stores are a specialized form of key-value stores which are highly optimized so as to allow extremely fast reads/writes from/to the database. In-memory key-value databases store the data in main memory (RAM), so that any request to read/write data can be serviced by just accessing the RAM instead of the disk. For this reason, such key-value stores are now increasingly being incorporated as caching layers in time-sensitive data analytics applications. In the following subsections we describe and compare the details of some widely used in-memory key-value stores.

2.1.1 Redis

Redis [5] is a very popular open-source (BSD licensed), in-memory, key-value data store.

Redis is widely used because of its great performance, adaptability, a rich set of data structures, and a simple API. According to the creator, Salvatore Sanfilippo, Redis is an

“in-memory data structure store used as a database, cache, and message broker” [5]. This is because Redis provides support for storing not only string values but also complex data structures like hashes, lists, and sets. Redis also has support for replication, Lua scripting

[11], Least Recently Used (LRU) eviction [12] and different levels of on-disk persistence.

On-disk persistence means that, apart from maintaining data in memory, there is an option to also persist the data by either dumping it to disk periodically or by appending each write command to a log file. Persistence can optionally be disabled if the application just requires a high-performance, in-memory caching mechanism.

The architecture of any Redis application is simple and consists of two main processes: the

Redis client and the Redis server. The client and server processes can run on the same computer or on two different computers. The server is responsible for storing data in memory and handling all read/write requests from the client. The client can be the Redis console client

(provided by RedisLabs) or any other application developed using Redis-client libraries

(available for a wide variety of programming languages). For trivial applications, which require basic caching facilities, one instance of a Redis server will suffice. However, most production level applications, will require more than one instance of a Redis server. “Redis

Cluster” (available since version 3.0) is a fairly new feature of Redis which involves running multiple instances of the Redis server on machines in the cluster. The basic structure of

Redis deployed in a cluster is as follows:

Figure 2.1: Redis Cluster

In a cluster, all the server instances are connected to each other, and together they maintain metadata about the state of the network. There may be more than one instance of the server running on one physical machine. The servers communicate using a customized and highly optimized version of the gossip protocol [13]. Client applications connect to these server instances and issue read/write requests. A requesting client application can be of two types:

• Smart-client requests: The client is responsible for locating the correct node on

which the data is located and issues requests directly to that node.

• Dummy-client requests: The client forwards its request to any one of the nodes. The

request is then forwarded to the appropriate server where the requested data is present.

The details of how the above connections are created and maintained are hidden from end-user applications by Redis client libraries. Redis can be deployed in a cluster in a variety

of ways, but the most common method is a sharded master-slave configuration, so as to enable replication of data. The logical structure of such an architecture is as follows:

Figure 2.2: Redis in a Master-Slave Architecture

The slave nodes are exact replicas of the master nodes, which ensures that the required data will be available even if a particular master node goes down. The Redis cluster manager tries to allocate slaves and masters such that the replicas are on different physical servers. “Redis Cluster” also provides many other useful features, like adding and removing nodes while applications are running, resharding of keys across nodes in the cluster, and multi-key operations (e.g., using wildcard characters to retrieve key-value pairs). Redis, whether running on a standalone machine or in a cluster, greatly improves the performance of applications. Due to its diverse set of useful features, and also because of its performance and ease of use, Redis today is one of the leading key-value stores used in industry and academia.
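As a concrete illustration of the client-server interaction described above, the sketch below uses the Jedis Java client to store and retrieve a plain string and a Redis hash. It assumes a single Redis server reachable on localhost at the default port 6379 and the Jedis library on the classpath; the key names are arbitrary examples, and this is not the benchmark configuration used later in this thesis.

import redis.clients.jedis.Jedis;

public class RedisExample {
    public static void main(String[] args) {
        // Connect to a Redis server (assumed to be running on localhost:6379).
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Simple string value.
            jedis.set("site:42:ozone", "0.054");
            System.out.println("ozone = " + jedis.get("site:42:ozone"));

            // Redis hash: several fields stored under a single key.
            jedis.hset("site:42", "city", "Houston");
            jedis.hset("site:42", "pollutant", "O3");
            System.out.println(jedis.hgetAll("site:42"));

            // Remove the keys again.
            jedis.del("site:42:ozone", "site:42");
        }
    }
}

When Redis runs as a cluster, a cluster-aware client (for example, JedisCluster in the same library) performs the key-to-node routing transparently.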

2.1.2 Memcached

Memcached [4] is an open-source, high-performance, distributed, in-memory key-value store which is used as a caching layer in many applications that deal with huge volumes of data.

Memcached is used for data caching in LiveJournal, Slashdot, Wikipedia, and other high-traffic sites [14]. According to Brad Fitzpatrick, the creator of Memcached, the primary motivation for creating Memcached was to improve the load performance of dynamic websites by caching individual objects on dynamic web pages. The main idea behind Memcached is to collect the main memory available in all machines connected in a network and pool it together so that the collective main memory capacity appears as one cohesive unit to applications using Memcached. This means that Memcached does not require extremely powerful servers to execute. Memcached can be run on commodity hardware connected together in a network. Nodes can be added/removed from the network without any adverse effects. Also, the effective total amount of RAM made available to client applications is larger and can easily be modified to suit application requirements.

Memcached is designed to have a client-server architecture. Memcached server instances are run over nodes in the network, wherever memory is available, and each server listens on a user-defined IP and port. The main memory from all running Memcached server instances forms a single, common memory pool, and client applications use this memory pool to store and retrieve data. Multiple Memcached server instances can run on a single physical machine. The basic structure of Memcached in action in the network is as follows:

Figure 2.3: Memcached Architecture

In Figure 2.3 [4], we have three Memcached server instances running in the cluster.

The keyspace is divided among the server instances such that each Memcached server is responsible for a particular set of key-value pairs. To store/retrieve a key-value pair, client applications must send requests to the correct Memcached instance. This is done by logically considering each Memcached server itself to be a bucket in a hash table.

To store/retrieve a key, the client calculates the hash of the key, which points to the correct

Memcached instance. Each Memcached instance, in turn, holds a hash table of its assigned key-value pairs. The client application can then store/retrieve the key-value pair. Thus,

Memcached acts as a two-layer global hash table. End-users need not worry about the details of how to connect to the correct Memcached instance. There are many

Memcached client libraries available in a wide variety of languages like C, C++, , Java,

PHP, Ruby, and Python. These libraries abstract away the internal details and present a simplified API which can then be used by applications. In Figure 2.3, if the application requests the key ‘foo’ (using a client library), the client library calculates the hash value of the key to locate the server which will process the request (in this case, ‘foo’ is present in

server 2). The request is then forwarded to the correct server (server 2). The server then responds to the client library by searching for ‘foo’ in its local hash table and returning it to the user.

Each server instance is independent of the others, and they do not communicate with each other. Also, the data inside the servers is maintained on a least-recently-used basis to make room for new items. In case a server fails or the requested data is not present in the cache, requests to the server result in a cache miss, which the application may then handle appropriately. Memcached clients have to be configured appropriately to deal with node failures. If no effort is taken in this direction, requests for keys assigned to a failed

Memcached instance simply result in cache misses. Memcached is designed for fast access to data by using optimized memory allocation algorithms, avoiding locking of objects so as to avoid waits, fetching multiple keys at the same time, etc. Due to its compactness, simplicity, and high performance, Memcached is widely used as a caching layer in many applications that require high-speed access to data [15].
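The sketch below shows the corresponding interaction through the spymemcached Java client, assuming two Memcached instances listening on the (placeholder) addresses node1:11211 and node2:11211; as described above, the client library hashes each key to decide which server is responsible for it.

import java.net.InetSocketAddress;
import java.util.Arrays;
import net.spy.memcached.MemcachedClient;

public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        // The client distributes keys across the listed servers by hashing each key.
        MemcachedClient client = new MemcachedClient(
                Arrays.asList(new InetSocketAddress("node1", 11211),
                              new InetSocketAddress("node2", 11211)));

        // set(key, expirationSeconds, value); 0 means the item never expires.
        client.set("foo", 0, "bar").get();   // wait for the asynchronous set to complete

        // get returns the cached object, or null on a cache miss.
        Object value = client.get("foo");
        System.out.println("foo = " + value);

        client.delete("foo").get();
        client.shutdown();
    }
}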

2.1.3 Riak

The Riak key-value store (known as Riak KV) [16] is a highly resilient key-value database. It is highly optimized to be available and scalable while running on a cluster of commodity hardware. Riak also provides in-memory caching by integrating Redis as the caching layer into its key-value database. This helps reduce latency and improves application performance. Riak stores data as a combination of keys and values, where the value can be anything ranging from JSON, XML, and HTML to binaries, images, etc. Keys are binary values which are used to uniquely identify a value. An application using Riak is part of a client-server request-response model. Client applications are responsible for connecting to a

Riak server and making read or write requests. User applications wanting to leverage Riak,

need not delve into the details of how to communicate with Riak servers. They can simply make use of simple APIs provided by client libraries, available for many programming languages like Java, Ruby, Python, PHP, Erlang, .NET, Node.js, C, Go, Perl, and

Scala. A Riak server is responsible for satisfying incoming client requests and can function as a stand-alone instance or be grouped with others to form a Riak cluster. All the Riak instances in a cluster work together, pooling their individual hardware resources to provide a global view of the database to client applications. They communicate with each other to provide data availability and partition tolerance.

Riak, working in a cluster, has a peer-to-peer architecture in which all the nodes can fulfill read and write requests. All nodes have the same set of functionalities, which is why there is no single point of failure in the architecture. Riak's architecture is arranged in the form of a “Ring”. Nodes in the cluster are assigned logical partitions, and these partitions are all considered part of the same hash space (in Figure 2.4 [16], node 0 is responsible for all green partitions, while all orange partitions are handled by node 1, and so on). Each partition is a logical entity that is managed by a separate process. This process is responsible for storing data and serving incoming read and write requests. Since the workload is distributed among multiple processes, Riak is extremely scalable. A physical machine in the network may have one or more partitions stored locally. Depending on the replication factor (say N), replicas of the data stored in one partition are also stored in the “next N partitions” of the hash space. Nodes in the cluster communicate with each other by exchanging a data structure known as the “Ring state”. At any given point in time, each node in the cluster knows the state of the entire cluster. A client can request a particular piece of information from any node in the cluster. If a node receives a request for data that is not present locally, it forwards the request to the proper node by consulting the ring state. The ring architecture explained above can be logically depicted as follows:

Figure 2.4: Riak Ring Architecture

Riak is an eventually consistent database [17]. Data is evenly distributed among all nodes in the cluster, and if a node goes down, key-value pairs are redistributed in an efficient manner. When a particular node goes down, a neighboring node takes over its responsibilities. When the failed node returns, the updates received by the neighboring node are handed back to it. This ensures that data is always available.

Riak also guarantees eventually consistent replicas of the data, meaning that while data is always available, not all replicas may have the most recent update at the exact same time. Due to its simple architecture, high performance, and well-documented client library APIs, Riak has found wide-spread use in many corporations like Uber, Alert Logic,

Zephyr and Rovio.

2.1.4 Hazelcast

Hazelcast [18] is an open source, in-memory data store written in Java. According to the documentation, “Hazelcast is an In-Memory Data Grid (IMDG) and allows for data to be evenly distributed among the nodes of a computer cluster and is designed to scale up

to hundreds of thousands of nodes”. While in-memory key-value stores like Redis started providing cluster support only after a few initial versions, Hazelcast was developed from the ground up with the intention of leveraging distributed computer architectures.

Hazelcast's architecture can be described as peer-to-peer. There are no masters and slaves, and hence there is no single point of failure. All nodes store an equal amount of data and do an equal amount of processing. The oldest node in the cluster is the de-facto leader and manages cluster membership by determining which node is responsible for which particular chunk of data. As new nodes join or drop out, the cluster re-balances accordingly. Each server instance runs in a separate Java Virtual Machine [19], and there may be more than one server instance running on a single physical machine. Hazelcast supports a client-server request-response design. Client applications making data requests are serviced by

Hazelcast server instances running on nodes in the cluster. User applications do not need to delve into the details of connecting to Hazelcast servers and making requests. There are a wide variety of client libraries that enable user applications to communicate with Hazelcast instances distributed on nodes in the network. Client libraries are provided for popular programming languages like Java, C++, .NET, Node.js, Python, and Scala. Figure 2.5

[20] depicts the communication mechanism between the client and server applications.

Figure 2.5: Hazelcast In-memory Computing Architecture

The client application makes requests to the Hazelcast server, which then fulfills them.

The communication pattern between the client and the servers can be one of the following:

• Embedded topology

The client application, the data and the Hazelcast instance all reside on the same

node and share a single JVM. The client and the server communicate with each other

directly.

• Client plus member topology

The client application and the Hazelcast instances are not tightly coupled and may

reside on different nodes of the cluster. They communicate with each other over the

network.

The two topologies listed above are depicted below [20].

(a) Embedded Topology (b) Client plus Member Topology

Figure 2.6: Hazelcast Architecture

Although the embedded topology is comparatively simple and there are no extra nodes to manage or maintain, the client plus member topology is usually preferred. This is because it provides greater flexibility in terms of cluster mechanics. Member JVMs can be taken down and restarted without affecting the application. The client plus member topology isolates the application code from cluster-related events. Hazelcast client applications can be either a “native client” or a “lite client”. A native client maintains a connection to any one node in the cluster and is redirected appropriately by that node when making requests. A lite client maintains data about each and every node in the cluster and makes requests to the correct Hazelcast instance. The Hazelcast instances share the keyspace such that no single instance is over-burdened. In case of node crashes,

Hazelcast also provides recovery and fail-over capabilities. Hazelcast is an open source library which is easily distributed in the form of a JAR file without the need to install any software. It supports in-built data structures like maps, queues, multimaps and also

allows for the creation of custom data structures. Hazelcast is used in many enterprise applications and has a huge client base that includes American Express, Deutsche Bank,

Domino's Pizza, and JCPenney.
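For illustration, the sketch below starts a Hazelcast member embedded in the JVM (the embedded topology described above) and uses one of its distributed maps; in the client plus member topology, the application would instead obtain its instance from a Hazelcast client. The map and key names are arbitrary examples.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class HazelcastExample {
    public static void main(String[] args) {
        // Start an embedded Hazelcast member in this JVM; additional members on the
        // network would automatically form a cluster with it.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map whose entries are partitioned across all cluster members.
        Map<String, String> sensors = hz.getMap("sensors");
        sensors.put("site:42", "Houston");
        System.out.println("site:42 -> " + sensors.get("site:42"));

        hz.shutdown();
    }
}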

2.1.5 MICA (Memory-store with Intelligent Concurrent Access)

MICA [21] is “a scalable in-memory key-value store that handles 65.6 to 76.9 million key-value operations per second using a single general-purpose multi-core system” [21]. MICA can be integrated into applications using a request-response, client-server model. MICA is installed across nodes in the cluster, and client applications can connect to these instances to make requests. The requesting client needs to know which server instance to contact. To serve multiple client requests efficiently, MICA is designed for high single-node throughput and low end-to-end latency. MICA also strives to achieve consistent performance across workloads, and can handle small, variable-length key-value items while still running on commodity hardware. To achieve all the above performance gains, MICA makes key design decisions regarding parallel data access, the network stack, and key-value data structures.

The following sub-sections describe these design choices in detail.

2.1.5.1 Parallel Data Access

To enable truly parallel access to data, MICA creates one or more data partitions (“shards”) per CPU core and stores key-value items in a partition determined by their key. An item's partition is determined by using a 64-bit hash of the item's key, calculated by the client application. Sometimes, such partitioning may lead to skewed workloads wherein a particular partition is used more often than others. In this case, MICA exploits CPU caches and packet burst I/O to disproportionately speed up more loaded partitions, nearly eliminating the penalty from skewed workloads. MICA can operate in EREW (Exclusive

Read Exclusive Write) or CREW (Concurrent Read Exclusive Write) modes. EREW assigns a single CPU core to each partition for all operations. The absence of concurrent access to partitions removes the need for synchronization and inter-core communication, making MICA scale linearly with CPU cores. CREW allows any core to read partitions, but only a single core can write. This combines the benefit of concurrent read and exclusive write; the former allows all cores to process read requests, while the latter still reduces expensive cache-line transfer.
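The keyhash-based partitioning underlying both modes can be illustrated with the small sketch below: the client hashes a key and uses the hash to choose a partition, and hence the CPU core that owns (EREW) or exclusively writes (CREW) that partition. This is only a conceptual illustration using a generic FNV-1a hash; it is not MICA's actual code or hash function.

import java.nio.charset.StandardCharsets;

public class PartitionChooser {

    // Generic 64-bit FNV-1a hash, used here purely for illustration;
    // MICA uses its own key hash computed by the client.
    static long fnv1a64(byte[] data) {
        long hash = 0xcbf29ce484222325L;
        for (byte b : data) {
            hash ^= (b & 0xffL);
            hash *= 0x100000001b3L;
        }
        return hash;
    }

    // Map a key to one of numPartitions partitions (one or more partitions per core).
    static int partitionOf(String key, int numPartitions) {
        long h = fnv1a64(key.getBytes(StandardCharsets.UTF_8));
        return (int) Long.remainderUnsigned(h, numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 8;   // e.g., one partition per CPU core
        System.out.println("key 'foo' -> partition " + partitionOf("foo", partitions));
    }
}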

2.1.5.2 Network Stack

MICA uses Intel's DPDK [22] instead of standard socket I/O. This allows the user-level server software to control NICs (Network Interface Cards) and transfer packet data with minimal overhead. This is done because the key-value pairs to be sent over the network are usually small compared to traditional TCP/IP packets. Also, TCP/IP features like congestion control and error correction are not strictly required in this particular case. By bypassing socket I/O, MICA avoids any additional network features that are not required and hence avoids delays. For NUMA (non-uniform memory access) systems [23], the data is partitioned such that each CPU core and NIC only accesses packet buffers stored in its respective NUMA domain. Each key-value pair to be transmitted is an individual packet; to further increase transmission speeds, MICA uses burst I/O. MICA also ensures that no CPU core is overloaded with requests by using processor affinity to determine which CPU is responsible for which partition of data. Requests for keys are then forwarded accordingly by the client.

2.1.5.3 Key-value Data Structures

MICA can be used either for storing data (no existing items can be removed without an explicit client request) or for caching data (existing items may be removed to reclaim space for new items). MICA uses separate memory allocators for cache and store semantics.

MICA uses a circular log for caching. New data is appended to the log, and existing data is modified in place. The oldest items at the head of the log are evicted to make space for newer entries when the cache is full. Although the natural eviction is FIFO, MICA can provide LRU eviction by reinserting any requested items at the tail. In store mode, MICA uses a lossy concurrent hash index to index stored items. Both of the above data structures exploit cache semantics to provide fast writes and simple memory management. Each

MICA partition consists of a single circular log and lossy concurrent hash index.

Figure 2.7 [24] clearly depicts MICA's in-memory key-value store approach. It also shows how a client request is forwarded to the server and how each design decision discussed above plays a part in enhancing the performance.

Figure 2.7: MICA Approach

MICA is written entirely in the C programming language, and it has a client library

in C. Applications that want to leverage MICA as a key-value store or cache can use this client library to make requests to MICA instances installed on a cluster. Although MICA has a set of impressive features, it is not as widely used as its counterparts. The reasons for this include limited documentation as well as a lack of client libraries in other programming languages.

2.1.6 Aerospike

Aerospike is a distributed, scalable NoSQL database. It was developed from the ground up with clustering and persistence in mind. Its architecture comprises the following layers [25]:

• Application layer

All end-user applications fall in this layer.

• Client layer

This layer consists of a set of client libraries written in a variety of languages like

C, Java, C#/.NET, Go, Perl, and Python. These client libraries are responsible for

monitoring the cluster on which Aerospike is installed and forwarding application

requests to the correct node.

• Clustering and distribution layer

This layer manages cluster communications and automates fail-over, replication,

cross-data center synchronization, and intelligent re-balancing and data migration.

• Data storage layer

This layer reliably stores data in DRAM and Flash for fast retrieval.

Figure 2.8: Aerospike Architecture

Aerospike uses a shared-nothing architecture: every node in the Aerospike cluster is identical, all nodes are peers, and there is no single point of failure. Data is distributed evenly and randomly across all nodes within the cluster. Nodes within the cluster communicate with each other using a “heartbeat call” to monitor inter-node connectivity and to maintain metadata about the cluster state. When a node is added to or removed from the cluster, data is automatically redistributed among the nodes. Aerospike also allows for replication of data so as to ensure reliability and availability even if a node goes down.

Replication is done on geographically separated nodes so as to ensure maximum availability. Any changes to the main data partition are also immediately reflected in the replicas.

On cluster startup, Aerospike configures policy containers called namespaces (similar to RDBMS databases). Namespaces are divided into sets (similar to RDBMS tables) and records (similar to RDBMS rows). Each record has a unique indexed key, and one or more bins (similar to RDBMS columns) that contain the record values. Applications can read or write this data by making requests using Aerospike client libraries. When data is to be stored, the client library computes a hash to determine which node the data is to be stored on and

forwards the request accordingly. Similarly, to read a particular key-value pair, the hash of the key is calculated by the client library and the request is forwarded to that node accordingly. If a node goes down, the client libraries communicate with the replicas until the node comes back up again. Aerospike maintains secondary indices of the data in memory for faster retrieval. One major feature of Aerospike is that the data can be persisted onto SSD (Solid

State Drive) storage. This hybrid model enables faster fetching of data compared to traditional HDD (Hard Disk Drive) storage. Aerospike also supports complex data types, queries, and User-Defined Functions (UDFs). Aerospike has steadily gained recognition for being a high-performing, scalable key-value store and is being used by organizations like Kayak,

AppNexus, Adform and Yashi.
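The namespace/set/bin model can be illustrated with the Aerospike Java client as sketched below; it assumes a server reachable on localhost:3000 and a namespace named "test" (a common default in sample configurations), and the set, key, and bin names are placeholders.

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Record;

public class AerospikeExample {
    public static void main(String[] args) {
        // Connect to one Aerospike node (assumed address); the client then
        // discovers the remaining nodes of the cluster on its own.
        AerospikeClient client = new AerospikeClient("localhost", 3000);

        // A record is addressed by (namespace, set, user key) and holds one or more bins.
        Key key = new Key("test", "sites", "site42");
        client.put(null, key, new Bin("city", "Houston"), new Bin("ozone", 0.054));

        // Read the record back; null policies mean default read/write policies.
        Record record = client.get(null, key);
        System.out.println("city = " + record.getString("city"));

        client.close();
    }
}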

2.1.7 Comparison of Key-Value Stores

In the previous sections, we briefly described the salient features of some widely used in-memory key-value stores. In this section, we compare them so as to pick the ones that we would like to analyze further. The comparison is done on the following factors:

• Programming languages

The aim is to select a database which has client libraries in widely-used major pro-

gramming languages. This ensures that the key-value store can be easily integrated

into scientific and big-data applications.

• Hadoop and HPC support

We want to select a database which can be easily integrated into Hadoop and High

Performance Computing environments (in our case we aim for Open MPI support).

This is because we will be analyzing the key-value store using an Open MPI micro-

benchmark and a Hadoop application.

• In-memory storage

Our aim is to analyze key-value stores which can be integrated as a caching layer in

compute intensive applications to see if we observe any performance benefits. Hence,

we look for a key-value store that maintains data in memory.

• Storage on files or databases

We also would ideally like the key-value database to persist data onto secondary

storage so that data is not lost.

• Access from remote locations

We plan to install the key-value store onto a cluster and then access the database

remotely using client applications, which is why easy remote access is important for

us.

• Support for parallel storage and operations

Ideally, we want data operations to be performed in parallel. The key-value store

should be able to run in a cluster and should be able to process multiple incoming

simultaneous requests. Unrelated data requests should not block operations and data

operations should be performed as soon as possible.

• Open Source

From a financial perspective, we aim to select key-value stores that are open source.

Table 2.1 gives a summary of the relevant features of all the key-value stores discussed above:

Table 2.1: Summary of features of key-value stores

Comparing the features of all the above in-memory key-value stores, we found Redis to be the best fit. Riak fulfills all of the above requirements, but its in-memory key-value store internally uses Redis, so we decided not to move forward with it. Similarly, Aerospike also has some promising features, but it requires Solid State Drives (SSDs) as the backing store, which, we believe, largely restricts its scope. Hazelcast and MICA do not have the option to back data onto a secondary storage medium, which is why we did not select them. Although Memcached, too, does not allow backing of data onto secondary storage, based on surveys [26] we observed that Redis and Memcached are the most widely-used

key-value stores. Hence, we decided to select Redis and Memcached for further analysis.

In the next sections, we examine the details of the Message Passing Interface (MPI) used in parallel computing and of the Hadoop framework, which is typically used for data analysis on a cluster of commodity hardware. We will also examine the ways in which in-memory key-value stores can possibly be integrated into these environments so as to offer performance improvements.

2.2 Brief Overview of Message Passing Interface (MPI)

Traditionally, computational problems were solved using serial algorithms, where instructions are executed one after the other. In parallel computing, a problem is broken down into discrete parts that can be executed concurrently by compute resources that communicate and coordinate with each other to produce the desired results. Parallel computing is thus used either to solve problems that are too large to be solved by a single compute resource or to solve problems faster than a single compute resource. The compute resources can be either a single computer with multiple cores or a set of computers connected through a network. If the compute resource is a single multi-core computer, then communication is done by reading and writing shared memory. However, for a distributed architecture, communication is done using sockets, message passing, or Remote Procedure Calls (RPC).

Generally, shared memory systems are easier to program than distributed memory systems. This is largely because of the inherent complexity of designing and coordinating concurrent tasks, and a lack of portable algorithms, standardized environments, and software development toolkits. There are constant innovations in microprocessor architecture, and as a result, parallel software developed with a particular architecture in mind soon becomes outdated, which ultimately undermines the effort taken to design it. Hence, there is a need for a standard library that enables

29 to develop portable, high-performance, parallel applications. MPI stands for

Message Passing Interface [27] and it is a standard that is created and maintained by the

MPI Forum, an open group consisting of parallel computing experts from the industry as well as academia. The MPI standard provides an Application Programming Interface

(API) [28] that is used for portable, high-performance inter-process communication (IPC)

[29] message passing.

On most operating systems, an “MPI process” usually corresponds to the operating system's concept of a process, and processes working together to solve a particular problem are part of a group so as to enable communication between them. MPI is designed to be actualized as middleware, meaning that upper-level applications invoke MPI functions to perform message passing without actually going into the details of how exactly communication takes place. MPI defines a high-level API and abstracts away the actual underlying communication methods used to transfer messages between processes. This abstraction is done to hide the complexity of inter-process communication from the upper-level application and also to make the application portable across different environments. A properly written MPI application is meant to be source-compatible across a wide variety of platforms and network types. MPI exposes APIs for point-to-point communication (e.g., send and receive) and also for other communication patterns, such as collective communication.

A collective operation is an operation where multiple processes are involved in a single communication. Reliable broadcast, which involves one MPI process sending a message to all other MPI processes in the group, is an example of a collective operation. There are many implementations of the MPI standard, targeted at a wide variety of platforms, operating systems, and network types. Some implementations are open source while others are closed source. Open MPI, as its name implies, is an open source implementation of MPI and is widely used in many high-performance computing environments. We have developed a

micro-benchmark using OpenMPI to analyze the performance of Redis and Memcached.

The details of this benchmark are given in the next chapter.
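For illustration, the sketch below broadcasts a value from rank 0 to all other ranks using the Java bindings shipped with recent Open MPI releases (assumed to be installed and on the classpath); the method names follow that Java interface and may differ in other MPI bindings, and the actual micro-benchmark is described separately in Chapter 3.

import mpi.MPI;
import mpi.MPIException;

public class BcastExample {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.getRank();
        int size = MPI.COMM_WORLD.getSize();

        // Rank 0 fills the buffer; bcast is a collective call executed by every rank.
        int[] buf = new int[1];
        if (rank == 0) {
            buf[0] = 42;
        }
        MPI.COMM_WORLD.bcast(buf, 1, MPI.INT, 0);

        System.out.println("Rank " + rank + " of " + size + " received " + buf[0]);
        MPI.Finalize();
    }
}

Such a program would be compiled against the Open MPI Java library and launched through mpirun with the desired number of processes.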

2.3 Brief Overview of MapReduce Programming and Hadoop

Eco-system

Lately, there has been a deluge of data that is huge and varied. Traditional data analysis tools are not equipped to handle the magnitude and variety of the data being generated, and that is where Hadoop [7] comes in. “The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.” [30]. In Hadoop, data storage and data analysis are both performed using the same set of nodes, which allows Hadoop to improve the performance of large-scale computations by using the principle of spatial locality [31]. Also, the cost of a Hadoop cluster is low due to the use of commodity hardware.

Together Hadoop-based frameworks have become the de-facto standard for storing and processing big data.

The Hadoop framework consists of three main components:

• HDFS: Hadoop Distributed File System (HDFS) [30] is a distributed file system

which is used to store very large files.

• MapReduce Framework: The MapReduce [30] module is responsible for carrying out

distributed analysis tasks by implementing the MapReduce paradigm.

• YARN: Yet Another Resource Negotiator (YARN) [32] is the resource manager for the

framework and is responsible for managing and allocating resources to the application

as and when required.

The Hadoop framework was largely inspired by the Google File System [33] and the MapReduce paradigm [8] introduced in 2004. These concepts laid the foundation for Hadoop, and by 2009 it had come to be widely used as a large-scale data-analysis platform. In this model, the total computational requirements of a Hadoop application are divided among the nodes in the cluster, and the data to be processed is stored in HDFS. HDFS divides each file into blocks and stores those blocks on nodes in the cluster. HDFS also provides fault tolerance by storing replicas of file blocks across the cluster; the default replica count is three, which may be configured according to the requirements of the application. For fault tolerance, HDFS stores one replica on the same rack where the original data is present, so that the failure of a node can be overcome quickly and processing can continue. Another replica is stored on a separate rack so that, in the event of a rack failure, the data remains available and can be analyzed.
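The replication factor is a per-file property that can be adjusted either through the Hadoop configuration or through the file system API. The following minimal sketch, written against the standard Hadoop Java API, illustrates both options; the file path used here is purely hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default replication factor for files created by this client (the HDFS default is 3).
        conf.setInt("dfs.replication", 3);
        FileSystem fs = FileSystem.get(conf);
        // The replication factor of an existing file can also be changed explicitly;
        // the path below is a hypothetical example.
        fs.setReplication(new Path("/data/input/example.csv"), (short) 2);
        fs.close();
    }
}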

The MapReduce style of programming is exceptionally flexible and can be used to solve a wide array of data analytics problems. A Hadoop cluster consists of computational nodes which can share workloads and take advantage of a very large aggregate bandwidth across the cluster. Hadoop clusters typically consist of a few master nodes, which control the storage and processing systems in Hadoop, and many slave nodes, which store all of the cluster's data and are also where the data gets processed. MapReduce involves the processing of a sequence of operations on distributed data sets. The data consists of key-value pairs, and the computations have only two phases: a map phase and a reduce phase. The key concept here is divide and conquer. A typical MapReduce application will have the following phases:

• During the Map phase, input data is split into a large number of fragments, each of which is assigned to a map task.

• These map tasks are distributed across the cluster.

• Each map task processes the key-value pairs from its assigned fragment and produces a set of intermediate key-value pairs.

• The intermediate data set is sorted by key, and the sorted data is partitioned into a number of fragments that matches the number of reduce tasks. This phase is known as the sort and shuffle phase.

• During the Reduce phase, each reduce task processes the data fragment that was assigned to it and produces an output key-value pair.

• These reduce tasks are also distributed across the cluster and write their output to HDFS when finished.

To put this in perspective, we can make use of a basic word-count example. The word-count operation takes place in two stages: a mapper phase and a reducer phase. In the mapper phase, the input text is tokenized into words, and a key-value pair is formed for each word such that the key is the word itself and the value is ‘1’. All the values corresponding to a given key go to one reducer; in the reduce phase, the values for each key are grouped together and added up. This process is visualized in Figure 2.9 [34].

Figure 2.9: Word Count Using Hadoop MapReduce
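To make the word-count example concrete, the following is a minimal sketch of the mapper and reducer using the standard Hadoop Java API; the class names are illustrative, and the driver code that configures and submits the job is omitted.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: tokenizes each input line and emits (word, 1) for every token.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(line.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

// Reducer: sums the counts received for each word and emits the total.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();
        }
        context.write(word, new IntWritable(sum));
    }
}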

In a MapReduce application, both the map and reduce functions are distributed. When a MapReduce application is launched, many copies of the program are started on the cluster of machines. One of the copies, called the master, controls the rest of the copies, the workers. The master is responsible for distributing the data across the workers and ensuring that all the workers are engaged in the successful completion of tasks. In case of a failure, tasks are automatically re-scheduled across the available workers. The intermediate key-value pairs generated by the map function are distributed across the workers which run the reduce function. The intermediate values are sorted and then merged by the reduce function, which emits them as output. This distribution of resources is handled by the YARN module of the Hadoop framework. In the next subsection, we briefly describe our reasoning for integrating an in-memory key-value store into a Hadoop application and the potential benefits that we may gain.

2.3.1 Integration of Key-Value Stores in Hadoop

The input, temporary results, and output of a MapReduce application are read from and written to disk via HDFS. Although HDFS is optimized to handle heavy loads, disk access tends to slow down performance. While the majority of MapReduce applications are meant to be executed in batch-processing mode, some applications require quick delivery of intermediate results. Scientific applications fall into this category, and hence this thesis aims to introduce an in-memory key-value store that acts as the primary backing store for MapReduce applications instead of HDFS. This is done with the intention of improving the overall performance of the application by reducing the time needed to read and write results. To achieve this, we studied, analyzed, and compared the features of many key-value stores widely used today. Our aim was to find a key-value store which retains data in main memory so as to reduce retrieval time, supports parallel computing and Hadoop applications, and which, preferably, is also open source.

The key-value stores that we analyzed were discussed earlier in this chapter, with the goal of selecting the ones that best suit our needs.

In the next chapter, we will compare the performance of Memcached and Redis using a micro-benchmark. We will also discuss the workings of the air-quality simulation application in detail and how integrating an in-memory cache into this application can potentially improve its performance.

Chapter 3

Analysis and Results

The previous chapter gave an overview of various in-memory key-value stores, OpenMPI, and the Hadoop framework. Of the widely used in-memory key-value stores that we evaluated, we were most interested in studying the performance of Redis and Memcached in detail.

To perform this analysis, we have developed a micro-benchmark application. We were also interested in integrating an in-memory key-value database into a compute-intensive application to evaluate whether it yields any performance benefits. For this analysis, we have used an air-quality simulation that generates the eight-hourly air-quality averages around sites in Houston.

In the initial part of this chapter, we describe the micro-benchmark application in detail and present our results and observations. We then describe the air-quality application in detail and present our strategy for incorporating an in-memory cache into a Hadoop application. We then conclude this chapter with our results and findings.

3.1 MPI Micro-benchmark

To compare the performance offered by Memcached with that offered by Redis, we have developed two micro-benchmark applications using C and the MPI library, one each for Memcached and Redis. The main intention behind developing these two micro-benchmarks was to perform an initial performance analysis of Memcached and Redis. Each micro-benchmark is a C program that establishes a basic communication setup between the Memcached/Redis servers running in a cluster and the respective client applications. The micro-benchmarks have been developed so that a user can easily specify configurations using only command-line arguments and input files. The parameters that the user can influence are as follows:

• Number of Servers
The number of Memcached/Redis servers to be used and their respective hostnames are passed to the program in an input text file. These servers then work together to handle incoming client requests.

• Number of Clients
The number of client processes making requests to the servers can be specified using command-line arguments. The number of client processes storing data can be configured separately from the number of clients retrieving data.

• Number of key-value pairs to be stored and retrieved
The total number of key-value pairs to be stored and retrieved can be indicated using command-line arguments.

• Individual value size
The size of individual values can be specified using command-line arguments.

In our analysis, the main properties that we want to evaluate are the scalability, reliability, and load-balancing ability of Memcached and Redis. By varying the input parameters to the benchmarks, we have evaluated and compared both Memcached and Redis with respect to these properties. In the next section, we present details of the micro-benchmark application and our findings.

3.1.1 Description of the Micro-benchmark Applications

Although we have developed two micro-benchmark applications, one each for Memcached and Redis, the two are very similar and only differ in the parts that require communication and synchronization with either Memcached or Redis. We first give details of the Memcached benchmark application; afterwards, we explain the Redis benchmark by describing only the sections of code that differ.

In the previous chapter, we explained that Memcached client libraries can be integrated into user applications to make requests to the server to store, retrieve, or modify a particular key-value pair. Memcached has a variety of client libraries for programming languages such as C, C++, Java, and C#/.NET. Since we are using MPI and the C programming language for our benchmark application, we have used libMemcached as our client library. libMemcached is an open-source C/C++ client library for the Memcached server which has been designed to be light on memory usage, thread safe, and to provide full access to server-side methods. Our MPI micro-benchmark applications make requests to Memcached server instances with the help of the APIs exposed by libMemcached. The cluster that we have used for our evaluations is the crill cluster at the University of Houston; the details of this cluster are provided later in this chapter.

Our MPI benchmark application acts as a client and sends requests to Memcached servers.

We initially start out by validating the input parameters and initializing the MPI environment. Once everything has been set up, we establish a connection to the required number of Memcached servers by using host names from a given input text file. In the following sample code, each line of the input file is fetched and interpreted as a host name with which a connection is to be established.

while ((readLen = getline(&line, &length, fp)) != -1)
{
    line[readLen - 1] = '\0';   /* strip the trailing newline */
    /* append this host (default Memcached port 11211) to the server list */
    servers = memcached_server_list_append(servers, line, 11211, &rc);
    /* register the accumulated server list with the memcached_st handle */
    rc = memcached_server_push(memc, servers);
}

Once the connections have been established, key-value pairs are stored on the Memcached servers. Depending on the number of instances of the client application to be executed and the number of key-value pairs to be stored/retrieved, the keyspace is divided equally among the MPI processes. Each MPI process is responsible for handling its subset of the keyspace, independently of the other MPI processes. For example, if 4 MPI processes are given the task of storing 20 key-value pairs, each process will generate and store 5 key-value pairs on the Memcached servers. If only 2 of these 4 MPI processes are given the responsibility of retrieving key-value pairs, then each retrieving client will be responsible for fetching 10 key-value pairs. Special care has been taken to avoid duplicate keys in the data set by using a combination of the current MPI process rank and the offset of the current key within the subset of data assigned to that process. Values are alphanumeric strings generated using a random function. These key-value pairs are later retrieved one by one, and the time taken to store and retrieve them is recorded. MPI barrier statements have been inserted between the generation and storage of the key-value pairs and their retrieval, because we pipeline the storage and retrieval tasks one after the other. The following code section demonstrates the relevant part of the benchmark that stores key-value pairs on the Memcached servers.

MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);

nKeyValPairs = atoi(argv[2]);                 /* total number of key-value pairs   */
nSubsetSize  = nKeyValPairs / numtasks;       /* pairs handled by each MPI process */
keyMin = taskid * nSubsetSize;                /* first key owned by this rank      */
keyMax = ((taskid + 1) * nSubsetSize) - 1;    /* last key owned by this rank       */

start = MPI_Wtime();
while (keyMin <= keyMax)
{
    sprintf(key, "%d", keyMin);               /* keys are unique across ranks      */
    gen_random(value, valueSize);             /* random alphanumeric value         */
    rc = memcached_set(memc, key, strlen(key), value, strlen(value),
                       (time_t)0, (uint32_t)0);
    keyMin++;
}
end = MPI_Wtime();

The micro-benchmark application for Redis works very similarly to the one described above; for brevity, we omit the code sections of the Redis micro-benchmark.

As our Redis client library, we have used Hiredis. Hiredis is a compact C client library for the Redis server; it is the official C client library recommended by Redis, and it is thread-safe with built-in write replication, auto-reconnect, and a couple of other useful features.

Thus, using these two benchmarks, we performed measurements to analyze and compare the performance of Memcached and Redis. In the next section, we give technical details of the hardware and software resources used.

3.1.1.1 Technical Data

For the analysis of our benchmark we have used the crill cluster at the University of Houston. The crill cluster consists of 16 nodes with four 12-core AMD Opteron (Magny-Cours) processors each (48 cores per node, 768 cores total), 64 GB of main memory, and two dual-port InfiniBand HCAs per node. The cluster has a PVFS2 (v2.8.2) parallel file system with 15 I/O servers and a stripe size of 1 MB. The file system is mounted onto the compute nodes over the second InfiniBand network interconnect of the cluster. The cluster utilizes SLURM as a resource manager. For development we have used the OpenMPI library (version 2.0.1), Memcached (version 1.4.20), Redis (version 3.2.8), Libmemcached (version 1.0.18), and Hiredis (version 1.0.0).

In the next few sections, we explain in detail the process that we have used to analyze and compare the performance of Memcached and Redis using the benchmark applications.

3.1.2 Comparison of Memcached and Redis using our Micro-benchmark

Integrating a database into a mission-critical application is a major decision, and organizations typically invest a lot of effort in selecting one that suits their needs. Any such analysis of databases is incomplete without taking into consideration how well they perform in terms of speed. The amount of time taken to store and retrieve data is one of the main parameters affecting the efficiency of a database. Hence, our benchmarks focus mainly on the time taken to store and retrieve a pre-determined amount of data.

However, there can be many factors that affect how fast data is stored and retrieved from the database. The major parameters that we are concerned with are as follows:

• Responsiveness.
To test for responsiveness, we vary the number of processes storing data to and retrieving data from the database servers. We believe that this experiment will give us an idea of how well a server handles parallel requests coming in from multiple client applications. Ideally, even as the number of parallel client requests increases, the database should stay responsive. This ensures that the performance is not hampered even when many clients work together to complete a single large task.

• Scalability.
To test for scalability, we vary the number of Memcached/Redis server instances running in the cluster. This will help us gain insights into how well a database performs load balancing. We expect that, as the number of servers increases, the data is also distributed evenly among the increasing number of servers. Hence the time taken by an individual server to search for a data item and return it to the client should go down, which in turn leads to lower execution times.

• Functionality in case of varying data load.
This experiment is aimed at understanding how well a database performs irrespective of the size of the data to be stored and fetched. To do this, we incrementally increase the size of the values to be stored in and retrieved from the database. We expect that as the data size increases, the execution times will also increase. The main aim of this experiment is to test whether both Memcached and Redis perform well despite increasing data loads.

We believe that analyzing Memcached and Redis based on the above three criteria will give us an overall understanding of their performance. It will also help quantify the overall performance levels of the two databases. We have executed our micro-benchmarks on the crill cluster, keeping in mind the above parameters. In the next few sections, we will examine and compare the results that we have observed.

3.1.2.1 Varying the Number of Client Processes

In this analysis, our main aim is to observe the performance of Memcached and Redis in the face of parallel data requests. To do this, we gradually increase the number of client processes making requests to the servers, while keeping all other aspects of the application fixed. This experiment has been performed in two parts. In the first part, we vary the number of client processes while keeping the value size fixed at 1 KB. In the second part, we vary the clients and keep the value size fixed at 32 KB. The reasoning for this two-part evaluation is explained in the following subsections.

3.1.2.1.1 Using Values of Size 1 KB

For this case, we have generated, stored, and retrieved 100,000 key-value pairs where each key is 20 characters long and each value is of size 1 KB. We have used eight Memcached and Redis server instances. The number of MPI processes is varied from 1 to 64 in steps of powers of 2, and the processes work together to store and retrieve the data. We have recorded three readings for the storage and retrieval times and report the minimum. The minimum storage and retrieval times observed in each case are given in Table 3.1:

Table 3.1: Time taken to store and retrieve data when the number of client processes is varied (value size 1 KB).

Figure 3.1 shows a comparison of the data storage and retrieval times for Memcached and Redis.

[Two-panel plot: minimum time taken to store data (sec) and minimum time taken to retrieve data (sec) versus the number of client processes, for Memcached and Redis.]

Figure 3.1: Time Taken to Store and Retrieve Data When the Number of Client Processes is Varied.

As observed in Figure 3.1, for both Memcached and Redis the time taken to store and retrieve the data decreases as the number of processes increases. However, towards the end the performance of Memcached is significantly worse than that of Redis, which leads us to conclude that Memcached is unable to keep up once the number of simultaneous client requests grows beyond a certain threshold. Overall, Redis gives better storage and retrieval times than Memcached.

3.1.2.1.2 Using Values of Size 32 KB

In production-level applications, the data size typically exceeds 1 KB. Hence, to get an idea of how Memcached and Redis would perform when integrated into a regular application, we decided to generate, store, and retrieve 100,000 key-value pairs where each key is 20 characters long and each value is of size 32 KB. For this analysis, we have used 16 Memcached and Redis server instances. As in the previous case, the number of processes is varied from 1 to 64 in steps of powers of 2. We have recorded three readings for the storage and retrieval times and report the minimum. The minimum storage and retrieval times observed in each case are given in Table 3.2:

Table 3.2: Time taken to store and retrieve data when the number of client processes is varied (value size 32 KB).

Figure 3.2 compares the storage and retrieval times for Memcached and Redis when the value size is 32 KB and the number of client processes is varied.

[Two-panel plot: minimum time taken to store data (sec) and minimum time taken to retrieve data (sec) versus the number of client processes, for Memcached and Redis, with 32 KB values.]

Figure 3.2: Time Taken to Store and Retrieve Data When the Number of Client Processes is Varied (Value Size 32 KB).

As seen in Figure 3.2, the time taken to store and retrieve 100,000 key-value pairs (each value of 32 KB) follows the same pattern as in the 1 KB case. However, in this case Memcached performs significantly worse than Redis. We observed that, despite some initial spikes, the storage and retrieval times gradually decrease as the number of clients is increased. Also, when the value size was increased to 32 KB, we noticed a considerable number of cache misses for Memcached. The reason for these misses is that Memcached does not back data with a secondary store; it is purely an in-memory key-value store.

Thus, for both of the above cases (values of size 1 KB and 32 KB), we conclude that Redis is better at handling parallel client requests. Redis also performs better than Memcached while storing and retrieving data. We observed that Redis was more reliable than Memcached and that it strives to achieve data availability in most cases, irrespective of the data size.

In the next section, we will present the results observed while testing for the scalability of both databases.

3.1.2.2 Varying the Number of Server Instances

We now run the second experiment by varying the number of Memcached and Redis server instances running in the cluster. As part of this experiment, we generate, store, and retrieve 100,000 key-value pairs, with each key of 20 characters and each value of size 1 KB. We run this experiment using 16 MPI processes, all of which share the load of storing and fetching the data. The numbers of server instances used are 1, 2, 4, 8, 12, and 16. We have recorded three readings for the storage and retrieval times and report the minimum. The minimum storage and retrieval times observed in each case are given in Table 3.3:

Table 3.3: Time taken to store and retrieve data when the number of servers is varied.

Figure 3.3 compares the performance of both databases when the number of servers is varied.

[Two-panel plot: minimum time taken to store data (sec) and minimum time taken to retrieve data (sec) versus the number of servers, for Memcached and Redis.]

Figure 3.3: Time Taken to Store and Retrieve Data When the Number of Servers is Varied.

The graphs show a downward trend in the time taken to store and retrieve the data, which is a positive indication of the scalability and load-balancing abilities of both Memcached and Redis. However, in this case too, we found that Redis outperforms Memcached.

3.1.2.3 Varying the Size of the Value

The previous two cases focused mainly on analyzing the responsiveness and scalability of Memcached and Redis. In this case, we subject both Memcached and Redis to increasing data loads and analyze how well they perform regular functions such as storing and retrieving data. For this case we have generated, stored, and retrieved 100,000 key-value pairs where each key is 20 characters long. We have used 16 Memcached and Redis server instances and 16 MPI processes, and all the MPI processes are equally responsible for handling the load. The value size is varied from 1 KB to 64 KB in steps of powers of 2. We have recorded three readings for the storage and retrieval times and report the minimum. The minimum storage and retrieval times observed in each case are given in Table 3.4:

Table 3.4: Time taken to store and retrieve data when the size of the value is varied.

Figure 3.4 shows the difference in data storage and retrieval times for Memcached and Redis when the data load is varied.

[Two-panel plot: minimum time taken to store data (sec) and minimum time taken to retrieve data (sec) versus the value size (bytes, ×10^4), for Memcached and Redis.]

Figure 3.4: Time Taken to Store and Retrieve Data when the Value Size is Varied.

In this experiment, we observed that, as the size of the individual values was increased, both Memcached and Redis gradually took more time to store and retrieve the data. Figure 3.4 clearly shows a linear relationship between the size of the data and the storage and retrieval times. For value sizes of up to 1 KB, both Memcached and Redis perform reasonably well while storing and retrieving data. However, as the data size is increased beyond 1 KB, the execution times double with each step. In this experiment too, we observed that Redis performs better than Memcached while storing and retrieving data. Also, in the case of Memcached, we observed that the number of data misses increased as the data size increased.

3.1.2.4 Observations and Final Conclusions

In this manner, we have performed a comprehensive analysis of Memcached and Redis using our OpenMPI micro-benchmark. We analyzed both key-value stores to gain an idea of how they react to varying data loads and varying numbers of client requests. We also performed experiments to test the scalability of the two databases. Both key-value stores performed fairly well in our test cases. However, as the number of client requests and the volume of data to be stored and fetched increased, the difference in performance between the two databases became apparent. Redis outperformed Memcached in all our test cases. We also noted that Redis was generally more reliable than Memcached in terms of data availability. This observation can be attributed to the fact that, contrary to Redis, Memcached has no option to back data to secondary storage. As a result, incorporating Memcached as an in-memory key-value cache in an application may lead to more cache misses as the volume of data and the number of incoming requests increase. Taking these results into consideration, we conclude that Redis is the better candidate to incorporate into applications as an in-memory caching mechanism. In the next section, we present the details of the air-quality simulation application and the method that we have used to integrate Redis into this application.

3.2 Air-quality Simulation Application

In the previous section, we analyzed and compared two in-memory key-value stores, namely Memcached and Redis, and concluded that Redis outperformed Memcached in most instances. In this section, we integrate Redis as a caching layer into a data-analysis application to see if it gives any significant performance benefits. The application that we are using is a Hadoop MapReduce application that calculates the eight-hour rolling average of air-quality data gathered around sites in Houston [6]. We are using a dataset that contains information about pollutants measured by various sensors placed all across Texas from 2009 to 2013, split into a total of five input files, one for each year. The total size of the dataset is 48.5 GB, and all the input files are stored in HDFS. Each input file is a comma-separated list of records, and each line consists of the following fields: year, month, day, hour, min, region, parameter id, parameter name, site, cams, value, and flag. The problem that we try to solve is to compute the eight-hour rolling average of O3 concentration in the air around sites in Houston, TX. This problem is broken down into two parts. In the first part, for every site in Houston, we calculate the hourly averages.

In the second part we combine the hourly averages to calculate the eight-hourly averages.

Using Hadoop MapReduce, we can solve this problem with two MapReduce jobs.

The first MapReduce job computes the average O3 concentration around sites in Houston for every hour. The data present in the input directory is divided into blocks and given as input to the mapper, which outputs (key, value) pairs that are then used by the reducer to perform the required aggregation. The key emitted by the mapper is a combination of siteId, year, day of the year, and hour. Only data points having the valid flag set, belonging to sites in Houston, with parameter name O3, and whose pollutant value is not null are considered valid data points for our measurement. The corresponding pollutant concentration is emitted by the mapper as the value. The reducer gets as input a subset of keys, where each key is associated with a list of values. For each key, the sum of the values and the number of values associated with that key are computed. If the frequency count for a given hour is above a certain threshold (greater than five in our case), the corresponding hourly average is computed. If the frequency is lower, a dummy value (“-1” in our case) is emitted to indicate that the value is inconsequential.

The second MapReduce job calculates the eight-hour rolling average of O3 concentration around sites in Houston. The mapper receives as input the hourly averages computed in the previous MapReduce job. For each input record, the mapper emits eight keys, indicating the eight consecutive hours starting from the hour in the input key, together with the average pollutant concentration value corresponding to the base hour. Special care has been taken to ensure that the hours emitted by the mapper roll over after 24 hours. Every instance of the reducer receives as input a list of average O3 concentration values associated with a particular hour. For every hour, the sum of the averages and a frequency count are computed, similar to the earlier MapReduce job. If the total number of valid entries for a given hour is above a certain threshold (greater than six in our case), the corresponding eight-hour rolling average is computed. If the frequency count is lower, a dummy value (“NA” in our case) is emitted by the reducer to indicate an inconsequential entry.
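To make the first job's map step concrete, the following is a hedged sketch that emits (siteId_year_dayOfYear_hour, O3 value) pairs for valid measurements. The field positions follow the column order listed above; the Houston-region code and the exact value of the "valid" flag are not specified in the text, so the constants used for them are placeholders.

import java.io.IOException;
import java.time.LocalDate;

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch of the first job's mapper: filters the raw records and emits
// (siteId_year_dayOfYear_hour, O3 concentration) for valid Houston data points.
public class HourlyAverageMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {

    private static final String HOUSTON_REGION = "12";  // placeholder region code
    private static final String VALID_FLAG = "VAL";     // placeholder "valid" flag value

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Columns: year, month, day, hour, min, region, parameter id,
        //          parameter name, site, cams, value, flag
        String[] f = line.toString().split(",");
        if (f.length < 12) {
            return;  // skip malformed records
        }
        boolean valid = "O3".equals(f[7].trim())
                && HOUSTON_REGION.equals(f[5].trim())
                && VALID_FLAG.equals(f[11].trim())
                && !f[10].trim().isEmpty();
        if (!valid) {
            return;  // only valid Houston O3 measurements are counted
        }
        int year = Integer.parseInt(f[0].trim());
        int month = Integer.parseInt(f[1].trim());
        int day = Integer.parseInt(f[2].trim());
        int hour = Integer.parseInt(f[3].trim());
        int dayOfYear = LocalDate.of(year, month, day).getDayOfYear();

        Text key = new Text(f[8].trim() + "_" + year + "_" + dayOfYear + "_" + hour);
        context.write(key, new DoubleWritable(Double.parseDouble(f[10].trim())));
    }
}

The corresponding reducer would then sum the values per key, apply the frequency threshold, and emit either the hourly average or the dummy value, as described above.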

Thus, using the above two MapReduce jobs, we have calculated the eight-hourly averages of O3 concentration. The next section describes our reasoning for integrating Redis as a caching layer into this application, and gives details of how this was achieved.

3.3 Integration of Redis in Hadoop

In the previous section, we described the air-quality MapReduce application in detail and pointed out that the input data to the application comes from HDFS and the output data is written to HDFS. However, intermediate data, such as the data passed from the first MapReduce job to the second MapReduce job, as well as the data passed on from the Mapper to the Reducer, is also written to HDFS. We believe that introducing an in-memory key-value store as a caching layer may boost the performance of this application, because data will be read from RAM and not from the disk. To test this hypothesis, we have decided to incorporate Redis as an in-memory cache in the air-quality application. To do this, we have customized the data input source and output destinations to suit our requirements.

In the previous chapter, we discussed that, when a MapReduce job starts, each input file is divided into splits and each of these splits is assigned to an instance of the Mapper.

Each split is further divided into records of key-value pairs which are then processed by the Mapper. The InputFormat class is responsible for configuring how contiguous chunks of input are generated from blocks in HDFS (or other sources). This class also provides a RecordReader class that generates key-value pairs from each individual split. Hadoop provides a set of standard InputFormat classes, but in our case we use our own InputFormat and RecordReader classes so as to read data from Redis. Similarly, to write our data to Redis instead of HDFS, we need to provide our own implementation of the RecordWriter class.
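The following is a minimal sketch of what such a Redis-backed RecordWriter could look like with the Jedis client; the host, port, and hash name are illustrative, and the actual implementation used in this work (shown in Figure 3.5) may differ.

import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import redis.clients.jedis.Jedis;

// Sketch: every (key, value) pair emitted by the reducer is stored as one field
// of a single Redis hash, mirroring the single-hash design described later in this chapter.
public class RedisRecordWriter extends RecordWriter<Text, Text> {

    private final Jedis jedis;
    private final String hashName;

    public RedisRecordWriter(String host, int port, String hashName) {
        this.jedis = new Jedis(host, port);  // e.g., "redis-host", 6379
        this.hashName = hashName;            // e.g., "hourly_averages"
    }

    @Override
    public void write(Text key, Text value) throws IOException {
        // One request/response round trip per pair.
        jedis.hset(hashName, key.toString(), value.toString());
    }

    @Override
    public void close(TaskAttemptContext context) throws IOException {
        jedis.close();
    }
}

A matching OutputFormat would return such a writer from its getRecordWriter() method so that the reducer output is redirected to Redis instead of HDFS.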

In the new application, we will still have two MapReduce jobs where the first job calculates the hourly averages and the second calculates the eight-hourly average. The flow of the new application will be as follows:

• The Mapper of the first MapReduce job reads data from the input file stored in HDFS and emits (key, value) pairs which are then used by the reducer to perform the required aggregation.

• The Reducer calculates the corresponding hourly averages and emits them to Redis instead of HDFS. To write to Redis, we use our own customized RecordWriter, shown in Figure 3.5.

Figure 3.5: Customized RecordWriter to Write Data to Redis

• The Mapper of the second job reads in the hourly averages from Redis and emits eight keys that indicate the eight consecutive hours starting from the hour indicated in the input key, together with the input average value corresponding to the base hour. The data is read from Redis instead of HDFS; to achieve this, we implement our own RecordReader, shown in Figure 3.6.

Figure 3.6: Customized RecordReader to Read Data from Redis

• Finally, the Reducer of the second MapReduce job reads in the output of the previous step from Redis and calculates and emits the final eight-hourly averages to HDFS.

To integrate Redis into our Hadoop application, we make use of a Java Redis client library called Jedis, which is the Java client officially recommended by Redis Labs. We have then implemented customized RecordReader and RecordWriter classes to read and write data from and to Redis using Jedis.
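As a complement to the writer sketched earlier, the following hedged sketch shows how a Jedis-based RecordReader could replay the entries of a Redis hash as (key, value) records for the mapper; the connection details and hash name are again illustrative, and a production version would also partition the keyspace across input splits.

import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import redis.clients.jedis.Jedis;

// Sketch: fetches all fields of a Redis hash once and iterates over them
// as the records that are handed to the mapper.
public class RedisRecordReader extends RecordReader<Text, Text> {

    private Iterator<Map.Entry<String, String>> entries;
    private final Text currentKey = new Text();
    private final Text currentValue = new Text();
    private int total;
    private int read;

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context) throws IOException {
        try (Jedis jedis = new Jedis("redis-host", 6379)) {              // illustrative host/port
            Map<String, String> all = jedis.hgetAll("hourly_averages");  // illustrative hash name
            total = all.size();
            entries = all.entrySet().iterator();
        }
    }

    @Override
    public boolean nextKeyValue() {
        if (entries == null || !entries.hasNext()) {
            return false;
        }
        Map.Entry<String, String> e = entries.next();
        currentKey.set(e.getKey());
        currentValue.set(e.getValue());
        read++;
        return true;
    }

    @Override
    public Text getCurrentKey() { return currentKey; }

    @Override
    public Text getCurrentValue() { return currentValue; }

    @Override
    public float getProgress() { return total == 0 ? 1.0f : (float) read / total; }

    @Override
    public void close() { }
}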

This concludes the description of the air-quality simulation application and of our customized version that uses Redis. In the next section, we present the details of the hardware and software resources used for our analysis.

3.3.1 Technical Data

The whale cluster located at the University of Houston is used to perform the analyses for this part of the work. It has 57 Appro 1522H nodes (whale-001 to whale-057). Each node has two 2.2 GHz quad-core AMD Opteron processors (8 cores total) with 16 GB main memory and Gigabit Ethernet. The cluster uses a 144-port 4x InfiniBand DDR Voltaire Grid Director ISR 2012 switch and two 48-port HP GE switches for the network interconnect. For storage, a 4 TB NFS /home file system and a 7 TB HDFS file system (using triple replication) are used. For development we have used Hadoop (version 2.7.2), Redis (version 3.2.8), and Jedis (version 2.8).

In the next section, we compare the performance of the two air-quality simulation applications described previously and present our conclusions.

3.4 Results and Comparison

In the previous section, we discussed in detail the Hadoop air-quality application and also our customized implementation with in-memory caching. In this section, we analyze the performance of the two applications with respect to the time taken to complete execution.

We then compare the execution times of both applications to see if integrating Redis as a caching layer provides any benefits.

To perform our analysis, we have used the whale cluster at the University of Houston and varied the number of reducers from 1 to 20 in steps of 5. We have executed both applications three times and report the minimum of the three runs. The results that we observed for the original air-quality application are given in Table 3.5:

Table 3.5: Time taken to execute the original air-quality application

No. of Reducers    Execution time
1                  5 min, 9 sec
5                  3 min, 33 sec
10                 3 min
15                 2 min, 42 sec
20                 2 min, 41 sec

The execution times that we observed for the air-quality application in which we integrated Redis are given in Table 3.6:

Table 3.6: Time taken to execute the air-quality application using Redis

No. of Reducers    Execution time
1                  5 min, 49 sec
5                  3 min, 43 sec
10                 3 min, 43 sec
15                 2 min, 45 sec
20                 2 min, 44 sec

Figure 3.7 helps put these execution times in perspective: it compares the total execution time taken by the two air-quality applications.

[Plot: execution time (min) versus the number of reducers for the HDFS-based and the Redis-based air-quality applications.]

Figure 3.7: Comparison of Execution Times (in minutes) for the Air-quality Applications Using HDFS and Redis.

Contrary to our expectations, we observed that integrating Redis into our application did not provide any added performance benefits. In fact, the total time taken by the application using in-memory caching is higher than that of the original application. We believe that the delay is introduced by the fact that we are using a single Redis hash to store the data. As a result, this hash becomes a bottleneck when a client tries to write multiple key-value pairs to the database. When a client wants to write data to Redis, it connects to a Redis server instance and issues a write to the hash; it then waits until it receives a response from the server before sending the next request. Essentially, all requests from a single client are serialized, which delays the completion of the requests. When we use more than one client, these delays accumulate and we see poor performance. To solve this problem, Redis provides an advanced feature called pipelining [35]. Using Redis pipelining, it is possible to send multiple commands to the server without waiting for the individual replies: a client buffers up a bunch of commands and ships them to the server in one go, thereby saving a network round trip for every command (a brief sketch of how such pipelining could look with Jedis is given at the end of this section). However, due to the lack of proper documentation for Jedis and time constraints, we could not explore pipelining of client requests, but we wish to continue exploring this option. We believe that introducing pipelining will give better results and will show the true benefits of using Redis as a caching layer in scientific applications. With this, we conclude the analysis and results chapter of this thesis; in the next chapter, we conclude the thesis by summarizing our analysis, observations, and findings.
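To make the idea of pipelining concrete, the following is a minimal sketch of how a batch of hash writes could be pipelined with Jedis; the hash name and the key and value strings are purely illustrative.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

// Sketch: commands are buffered on the client and sent to the server in one batch,
// avoiding one network round trip per hset call.
public class PipelineSketch {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline pipeline = jedis.pipelined();
            for (int i = 0; i < 10000; i++) {
                pipeline.hset("hourly_averages", "key-" + i, "value-" + i);
            }
            pipeline.sync();  // flush the buffered commands and collect all replies
        }
    }
}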

Chapter 4

Conclusions and Outlook

In recent years, industry as well as academia has faced an unprecedented data explosion, and performing analyses on these large datasets is becoming increasingly common. Data analysis is performed so as to find previously unknown correlations between datasets; at the same time, there is a tremendous need to make proper use of the available computing resources. Traditional RDBMS databases are unable to keep up with the huge volume of data that is being generated. To complicate matters further, the data being generated is obtained from various sources and may be structured or unstructured. NoSQL databases overcome many of the shortcomings of RDBMS systems and have emerged as a solution for storing and analyzing big data. There are many types of NoSQL databases, and lately key-value NoSQL databases are being increasingly used due to their simplicity and ease of use. In-memory key-value stores are a special kind of key-value database that retains data in main memory instead of on secondary storage, so as to speed up access to the data. As a result, they are being used in compute-intensive applications as an intermediate caching layer to store intermediate and final results. This ensures faster read times and hence enhances the performance of the application. The main focus of this thesis is to analyze and compare the various in-memory key-value stores available on the market today.

We have analyzed popular in-memory key-value stores such as Memcached, Redis, Riak, Hazelcast, Aerospike, and MICA, and compared them based on features such as in-memory caching, support for multiple parallel requests, open-source availability, and easy access from remote locations. Based on this analysis, we were most interested in studying Redis and Memcached in detail. To do so, we developed a micro-benchmark using C and the OpenMPI library to analyze and compare Memcached and Redis. Based on our measurements, we concluded that Redis is more scalable and reliable than Memcached; we also found Redis to be more resilient in the face of large data requests. Based on these observations, we concluded Redis to be the better of the two.

To test how well Redis performs as an in-memory cache, we have integrated it into a Hadoop MapReduce application that computes the eight-hourly average of air quality around sites in Houston. We have used a 48.5 GB dataset that contains data collected from various sites in Texas from 2009 to 2013. This task has been achieved in two parts using two MapReduce jobs: the first job is responsible for calculating the hourly averages, and the second job calculates the final eight-hourly averages. The main aim was to compare the execution times of this application with those of a similar Hadoop application that does not use in-memory caching. Although we observed promising results for parts of the application, integrating Redis as a caching layer did not offer any overall performance benefits.

However, we believe that this problem can be solved using an advanced feature called Redis pipelining and we wish to explore this further.

In the future, we are interested in benchmarking other in-memory key-value stores such as Riak. We also want to integrate Memcached as a caching layer into a data-analysis application to observe its performance in a real-world scenario. Further, we would like to explore other components of the NoSQL ecosystem so as to improve the analytical abilities of big data applications.

Bibliography

[1] Rick Cattell. Scalable sql and nosql data stores. SIGMOD Rec., 39(4):12–27, May 2011. [2] Ameya Nayak, Anil Poriya, and Dikshay Poojary. Type of nosql databases and its comparison with relational databases. International Journal of Applied Information Systems, 5(4), March 2013. Published by Foundation of Computer Science, New York, USA.

[3] Key-value database - wikipedia. https://en.wikipedia.org/wiki/Key-value_database. [Online; accessed 16-Mar-2017]. [4] Brad Fitzpatrick. Distributed caching with memcached. Linux J., 2004(124):5–, August 2004.

[5] Redis. https://redis.io/. [Online; accessed 23-Dec-2016]. [6] Haripriya Ayyalasomayajula, Edgar Gabriel, Peggy Lindner, and Daniel Price. Air quality simulations using big data programming models. In Big Data Computing Service and Applications (BigDataService), 2016 IEEE Second International Conference on, pages 182–184. IEEE, 2016. [7] Konstantin Shvachko, Hairong Kuang, Sanjay Radia, and Robert Chansler. The hadoop distributed file system. In Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), MSST ’10, pages 1–10, Washington, DC, USA, 2010. IEEE Computer Society. [8] Jeffrey Dean and Sanjay Ghemawat. Mapreduce: Simplified data processing on large clusters. Commun. ACM, 51(1):107–113, January 2008.

[9] Introduction to big data: Types, characteristics & benefits. http://www.guru99. com/what-is-big-data.html. [Online; accessed 04-Nov-2016]. [10] Acid - wikipedia. https://en.wikipedia.org/wiki/ACID. [Online; accessed 21- February-2017].

[11] The programming language lua. https://www.lua.org/. [Online; accessed 27-Dec- 2016].

[12] Using redis as an lru cache redis. https://redis.io/topics/lru-cache. [Online; accessed 24-Dec-2016].

[13] Gossip protocol - wikipedia. https://en.wikipedia.org/wiki/Gossip_protocol. [Online; accessed 11-Jan-2017].

[14] Memcached - a distributed memory object caching system. https://memcached.org/. [Online; accessed 27-Nov-2016]. [15] Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, and Venkateshwaran Venkataramani. Scaling memcache at facebook. In Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, nsdi’13, pages 385–398, Berkeley, CA, USA, 2013. USENIX Association.

[16] Riak KV Enterprise technical overview. http://info.basho.com/rs/721-DGT-611/images/RiakKV%20Enterprise%20Technical%20Overview-6page.pdf. [Online; accessed 01-Feb-2017].

[17] Consistent hashing - wikipedia. https://en.wikipedia.org/wiki/Consistent_ hashing. [Online; accessed 31-Dec-2016]. [18] An architect’s view of hazelcast imdg - hazelcast.com. https://hazelcast.com/ resources/architects-view-hazelcast/. [Online; accessed 02-Feb-2017]. [19] Java virtual machine - wikipedia. https://en.wikipedia.org/wiki/Java_virtual_ machine. [Online; accessed 28-Feb-2017]. [20] Hazelcast documentation. http://docs.hazelcast.org/docs/3.3/manual/pdf/ hazelcast-documentation-3.3.5.pdf. [Online; accessed 03-Feb-2017]. [21] Hyeontaek Lim, Dongsu Han, David G. Andersen, and Michael Kaminsky. Mica: A holistic approach to fast in-memory key-value storage. In 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14), pages 429–444, Seattle, WA, 2014. USENIX Association.

[22] Data plane development kit. http://dpdk.org/. [Online; accessed 03-Mar-2017]. [23] Non-uniform memory access - wikipedia. https://en.wikipedia.org/wiki/ Non-uniform_memory_access. [Online; accessed 07-Mar-2017]. [24] Mica: A holistic approach to fast in-memory key- value storage. http://www.slideserve.com/schuyler/ mica-a-holistic-approach-to-fast-in-memory-key-value-storage. [On- line; accessed 04-Feb-2017].

[25] Aerospike architecture. http://www.aerospike.com/docs/architecture. [Online; accessed 04-Mar-2017].

[26] Db-engines ranking - popularity ranking of key-value stores. https://db-engines.com/en/ranking/key-value+store. [Online; accessed 03-Feb-2017].

[27] Mpi: A message-passing interface standard. http://mpi-forum.org/docs/mpi-3.1/ mpi31-report.pdf. [Online; accessed 13-Dec-2016].

[28] Application programming interface - wikipedia. https://en.wikipedia.org/wiki/ Application_programming_interface. [Online; accessed 02-March-2017].

[29] Inter-process communication - wikipedia. https://en.wikipedia.org/wiki/ Inter-process_communication. [Online; accessed 26-Feb-2017].

[30] Apache hadoop. http://hadoop.apache.org/. [Online; accessed 10-Feb-2017].

[31] Locality of reference - wikipedia. https://en.wikipedia.org/wiki/Locality_of_ reference. [Online; accessed 21-February-2017].

[32] Apache hadoop yarn. https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/ hadoop-yarn-site/YARN.html. [Online; accessed 17-Apr-2017]. [33] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The google file system. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, SOSP ’03, pages 29–43, New York, NY, USA, 2003. ACM.

[34] Hadoop word count example. https://cs.calvin.edu/courses/cs/374/ exercises/12/lab/. [Online; accessed 12-Dec-2016].

[35] Redis pipelining. https://redis.io/topics/pipelining. [Online; accessed 12-Apr- 2017].

[36] M. Berezecki, E. Frachtenberg, M. Paleczny, and K. Steele. Many-core key-value store. In Proceedings of the 2011 International Green Computing Conference and Workshops, IGCC ’11, pages 1–8, Washington, DC, USA, 2011. IEEE Computer Society.

[37] Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. Workload analysis of a large-scale key-value store. In Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’12, pages 53–64, New York, NY, USA, 2012. ACM.

[38] Shared memory hash table vishesh handa’s blog. http://vhanda.in/blog/2012/07/shared-memory-hash-table/. [Online; accessed 01-Dec-2016]. [39] Tom White. Hadoop: The Definitive Guide. O’Reilly Media, Inc., second edition, October 2010.

[40] Thilina Gunarathne and Srinath Perera. Hadoop MapReduce Cookbook. Packt Publishing, first edition, February 2013.

[41] Introduction to mapreduce and hadoop. http://people.csail.mit.edu/matei/talks/2010/amp_mapreduce.pdf. [Online; accessed 17-Apr-2017].

[42] Edgar Gabriel. Cosc 6374 parallel computation, fall 2015. http://www2.cs.uh.edu/ ~gabriel/courses/cosc6374_f15/index.shtml. [Online; accessed 17-Apr-2017]. [43] Edgar Gabriel. Cosc 6339 big data analytics, spring 2015. http://www2.cs.uh.edu/ ~gabriel/courses/cosc6339_s15/index.shtml. [Online; accessed 17-Apr-2017]. [44] Rdbms, 2016. [Online; accessed 28-November-2016].

[45] Emilio Coppa. Hadoop architecture overview. http://ercoppa.github.io/ HadoopInternals/HadoopArchitectureOverview.html. [Online; accessed 17-Apr- 2017].

[46] Open mpi: Open source high performance computing. https://www.open-mpi.org/. [Online; accessed 17-Apr-2017].

[47] Jeffrey M. Squyres. The architecture of open source applications (volume 2): Open mpi. http://www.aosabook.org/en/openmpi.html, 2015. [Online; accessed 17-Apr- 2017].
