© 2017 by the authors; licensee RonPub, Lübeck, Germany. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

Open Access
Open Journal of Cloud Computing (OJCC)
Volume 4, Issue 1, 2017
http://www.ronpub.com/ojcc
ISSN 2199-1987

Performance Aspects of Object-based Storage Services on Single Board Computers

Christian Baun, Henry-Norbert Cocos, Rosa-Maria Spanou

Faculty of Computer Science and Engineering, Frankfurt University of Applied Sciences, Nibelungenplatz 1, 60318 Frankfurt am Main, Germany [email protected], [email protected], [email protected]

ABSTRACT

When an object-based storage service is demanded and the cost for purchase and operation of servers, workstations or personal computers is a challenge, single board computers may be an option to build an inexpensive system. This paper describes the lessons learned from deploying different private cloud storage services, which implement the functionality and API of the Amazon Simple Storage Service, on a single board computer, the development of a lightweight tool to investigate the performance, and an analysis of the gained measurement data. The objective of the performance evaluation is to get an impression of whether it is possible and useful to deploy object-based storage services on single board computers.

TYPE OF PAPER AND KEYWORDS

Regular research paper: single board computers, private cloud, object-based storage services, simple storage service

1 INTRODUCTION

A specific sort of cloud services, which belongs to the Infrastructure as a Service (IaaS) delivery model, is the object-based storage services. Examples for public cloud offerings, which belong to this sort of services, are the Amazon Simple Storage Service (S3) and Google Cloud Storage, which also implements the S3-API. Furthermore, several free private cloud solutions exist, which implement the S3 functionality and API.

For research projects, educational purposes like student projects, and in environments with limited space and/or energy resources, single board computers may be a useful alternative to commodity hardware servers, because they cause lower purchase and operating costs. In addition, such single board computer systems can be constructed in a way that they can easily be transported by the users [2] because of their low weight and compact design. Single board computer systems are not only well suited for many cloud computing applications, but also for Internet of Things (IoT) [19] [47] scenarios and more modern distributed systems architecture paradigms like Fog Computing [7] [25] and Dew Computing [39] [45].

The drawback of single board computers is their limited hardware resources, which cannot compete with the performance of higher-value systems [3]. For this work, the Raspberry Pi 3 single board computer was selected as hardware platform, because the purchase cost for this device is just approximately 40 € and operating system images of several different distributions exist for its architecture. The computer provides four CPU cores (ARM Cortex-A53), 1 GB of main memory and a 10/100 Mbit Ethernet interface. A microSD card is used as local storage.


In [2] we already investigated the characteristics, as well as the performance and energy-efficiency, of a mobile cluster of eight Raspberry Pi 1 computers and its single components. In this work we also mentioned useful applications for such a system. In [3] we analyzed and compared the performance and energy-efficiency of different clusters of single board computers with the High Performance Linpack (HPL) [20] benchmark. This work covered clusters of Raspberry Pi 1, Banana Pi and Raspberry Pi 2 nodes. For this work, several free private cloud S3 reimplementations have been evaluated and, where possible, deployed on a Raspberry Pi 3. The performance and energy efficiency have been investigated, among others, with S3perf, a self-developed tool that measures the required time to carry out typical storage service operations.

This paper is organized as follows. Section 2 contains a discussion of related work. In Section 3, the deployment of different private cloud storage services on the Raspberry Pi 3 is described. Section 4 presents the development and implementation of the tool S3perf. An analysis of the gained measurement data is done in Section 5. Section 6 presents conclusions and directions for future work.

2 RELATED WORK

In the literature, several works cover the topic of measuring the performance of storage services with the S3 interface.

Garfinkel [18] evaluated in 2007 the throughput via HTTP GET operations which Amazon S3 can deliver with objects of different sizes over several days from several locations by using a self-written tool. The work is focused on download operations and does not analyze the performance of other storage service related operations in detail. Unfortunately, this tool has never been released by the author and the work does not investigate the performance and functionality of private cloud scenarios.

Palankar et al. [32] evaluated in 2008 the ability of Amazon S3 to provide storage support to large-scale science projects from a cost, availability, and performance perspective. Among others, the authors evaluated the throughput (HTTP GET operations) which S3 can deliver in single-node and multi-node scenarios. They also measured the performance from different remote locations. No used tool has been released by the authors and the work does not mention the performance and functionality of private cloud scenarios.

Zheng et al. [50] described in 2012 and 2013 the Cloud Object Storage Benchmark (COSBench) [12], which is able to measure the performance of different object-based storage services. It is written in Java and provides a web-based user interface and helpful documentation for users and developers. The tool supports not only the S3 API, but also the Swift API, and it can simulate different sorts of workload. It is, among others, possible to specify the number of workers which interact with the storage service and the read/write ratio of the access operations [49]. The complexity of COSBench is also a drawback, because the installation and configuration require some effort.

McFarland [43] implemented in 2013 two Python scripts, which make use of the boto [8] library to measure the download and upload data rate of the Amazon S3 service offering for different file object sizes. Those solutions offer only little functionality. They just allow measuring the required time to execute upload and download operations sequentially. Parallel operations are not supported and fundamental operations other than the upload and download of objects are not considered. Furthermore, the solution does not support the Swift API.

Land [26] analyzed in 2015 the performance of different public cloud object-based storage services with files of different sizes by using the command line tools of the service providers and by mounting buckets of the services as file systems in user-space. This work does not analyze the performance and functionality of private cloud scenarios. The author uploaded for his work a file of 100 MB in size and a local git repository with several small files into the evaluated storage services and afterwards erased the files, but in contrast to our work, he did not investigate the performance of the single storage service related operations in detail.

Bjornson [6] measured in 2015 the latency – time to first byte (TTFB) – and the throughput of different public cloud object-based storage services by using a self-written tool. Unfortunately, this tool has never been released by the author. The work does not consider the typical storage service related operations and it does not analyze the performance and functionality of private cloud scenarios.

In contrast to the related works in this section, we evaluate the performance of private cloud solutions on single board computers and not the performance of one or a few public cloud service offerings. In addition, we developed and implemented a flexible and lightweight solution to analyze the performance of the most important storage service operations and not only of the HTTP GET operation or of the latency. We used this tool, which has also been released as free software, to analyze the performance of storage service solutions.


3 DEPLOYMENT OF PRIVATE CLOUD STORAGE SERVICES WITH THE S3-INTERFACE ON A SINGLE BOARD COMPUTER

For this work, several free private cloud storage services, which re-implement the functionality of the Amazon S3 and its API, have been taken into consideration. Table 1 presents an overview of the investigated storage services and their characteristics.

Some of the investigated services (Cumulus, Fake S3, S3ninja, S3rver, Scality S3 Server and Walrus) support only single-node deployments, which limits the reliability and scalability of these services in principle. These services focus on providing lightweight solutions, mainly for testing and development purposes. Ceph-RGW, Minio, Riak CS and Swift are designed to implement multi-node deployments.

The table among others highlights the programming languages in which the different storage service solutions are implemented. This information may be important for application scenarios where additional functions need to be implemented by the users. Furthermore, in practice, not all systems provide all compilers or runtime environments.

When implementing a single-node service, the durability of the stored data depends entirely on the storage system which is used. It is possible to use these services as a front end and via the S3 interface to access

• a folder inside a POSIX file system, which is located on a locally connected storage drive,
• a distributed file system, which spans multiple physical servers,
• a network attached storage (NAS) or even
• a storage area network (SAN).

When using a multi-node deployment with Ceph-RGW, Minio, Riak CS or Swift, the data is replicated over several nodes by the storage service itself.

Whenever possible, the services described in this work have been deployed on a Raspberry Pi 3 single board computer. The operating system used was Raspbian GNU/Linux 8.0 – a Jessie derivate – with release date 2017-03-02. The used Linux kernel was revision 4.4.50. For this work, all deployed storage services used a local folder on the microSD card, which is connected to the Raspberry Pi 3, to store the buckets and objects. In order to investigate the performance of the services, the required time to carry out a set of operations was measured.

3.1 Ceph-RGW

The distributed file system Ceph [46] supports among others the Amazon S3 REST API [11]. This functionality of Ceph is also called Ceph Object Gateway or RADOS Gateway (RGW). Ceph is free software and licensed according to the GNU Lesser General Public License (LGPL).

Despite several success stories [1] [13] [14] [44] in the literature, which describe the deployment of Ceph on Raspberry Pi 3 computers, our attempts to install and configure Ceph inside the used Raspbian revision did not result in a stable testbed. Therefore, for this work, Ceph was not further evaluated.

3.2 Cumulus

The storage service Cumulus [9] has been developed in the context of the Nimbus infrastructure project [24]. Cumulus is developed in Python and its source code is licensed according to the Apache License 2.0 [30]. The Cumulus service allows only single-node deployments. Because Cumulus does not rely on any higher-level Nimbus libraries, it is possible to install it as a stand-alone storage service without the entire Nimbus IaaS solution.

In order to install Cumulus, the system must provide the Python interpreter 2.5 or a more recent revision. The single installation steps are well documented in the Cumulus documentation inside the QUICKSTART.txt file, which can be found inside the Cumulus source code repository.

In case of a Cumulus-only installation, the user management functionality of the Nimbus infrastructure cannot be used. For such scenarios, Cumulus provides a set of command line tools (cumulus-add-user, cumulus-list-users and cumulus-remove-user) to create, remove, and print out a list of current users, as sketched below.

After the first start of the service, all relevant configuration data is stored inside the file ~/cumulus/etc/cumulus.ini. Among others, the following parameters are specified: the port number of Cumulus, the path of the folder which is used to store the buckets and objects, the hostname, the path and file name of the logfile, and whether HTTPS shall be used.
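For a Cumulus-only installation, user management can hence be handled entirely on the command line. A minimal sketch, assuming the tools take the user's e-mail address as their only argument (the exact invocation may differ between Cumulus revisions):

    cumulus-add-user [email protected]      # create a user and print the generated credentials
    cumulus-list-users                  # print a list of all current users
    cumulus-remove-user [email protected]   # remove the user again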


Table 1: Storage services which implement the S3-API and their characteristics

    Service                              Tested revision   Multi-node deployment   Programming language   License
    Ceph-RGW                             v10.2.7           possible                C++                    LGPL
    Cumulus                              v2.10.1           impossible              Python                 Apache License 2.0
    Fake S3                              v1.0.0            impossible              Ruby                   MIT License
    Minio                                v2017-03-16       possible                Go                     Apache License 2.0
    Riak CS                              v2.1.1            possible                Erlang                 Apache License 2.0
    S3ninja                              v2.7              impossible              Java                   MIT License
    S3rver                               v0.0.7            impossible              Node.js                MIT License
    Scality S3 Server                    v7.0.0            impossible              Node.js                Apache License 2.0
    Swift (with the swift3 middleware)   v2.13.0           possible                Python                 Apache License 2.0
    Walrus                               v4.4.0            impossible              Java                   GPLv3

3.3 Fake S3

The Fake S3 service is developed in the programming language Ruby and until revision v0.2.5, the source code is licensed according to the MIT License [17]. Later revisions are not licensed according to a license which complies with the open source definition of the Open Source Initiative (OSI). The service offers only single-node deployments and no graphical user interface.

In order to deploy Fake S3, the programming language Ruby and the package manager RubyGems need to be installed first. The installation of Fake S3 is done via the command gem install fakes3. Important service related parameters like the port number and the path of the S3 data folder are provided as command line parameters when starting the Fake S3 service.

During the evaluation of Fake S3, some bugs have been observed with this service. Buckets are still seen by the service, even if they have been erased before. With the evaluated revisions, it was not possible to modify the access key and the secret access key via command line parameters or to specify them via environment variables.

3.4 Minio

The Minio service is developed in the programming language Go and its source code is licensed according to the Apache License 2.0 [28]. Minio provides a web user interface, which allows carrying out all object and bucket related tasks. The Minio project offers not only the source code of the service, but also binaries for the operating systems Mac OS X, Linux, FreeBSD and Microsoft Windows. Linux binaries for x86-compatible architectures and even ARM architectures are available. When starting the service, the port number and a local folder for the object data are provided as command line parameters. The user access key and secret access key can be specified via the environment variables MINIO_ACCESS_KEY and MINIO_SECRET_KEY.

During the first start of the service, all relevant configuration data is stored inside the file ~/.minio/config.json. Inside this file, among others, the following parameters are specified: the user access key and secret access key, whether the web user interface shall be used, whether the server shall print out messages on the command line and with which logging level, and whether the server shall write messages into a logfile as well as the file name and the logging level.

Since November 2016, Minio provides multi-disk support with internal replication on single node deployments and across multiple nodes. To avoid a single node being a bottleneck for requests, a tool like the lightweight web server software nginx can be used as a proxy [27].
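Both Fake S3 and Minio can be started with a single command. The following sketch summarizes the deployment steps described above; the fakes3 flags (-r for the root folder, -p for the port), the data folder paths and the key values are assumptions for illustration:

    # Fake S3: install and start on port 4567 with a local data folder
    gem install fakes3
    fakes3 -r /home/pi/fakes3data -p 4567

    # Minio: pass the credentials via the documented environment
    # variables and the data folder as command line parameter
    export MINIO_ACCESS_KEY=myaccesskey
    export MINIO_SECRET_KEY=mysecretkey
    ./minio server /home/pi/miniodata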
3.5 Riak CS

Riak CS (Cloud Storage) implements an S3-like object storage with the S3-API on top of Riak KV [29], which is a distributed NoSQL key-value data store. Riak KV implements the characteristics of Amazon Dynamo [15] and has therefore a focus on multi-node scalability, fault tolerance and data durability thanks to internal replication.

The even distribution of stored data is carried out fully automatically by Riak KV. To achieve this characteristic, Riak KV places the keys inside a Distributed Hash Table (DHT) [23]. The DHT is split into partitions, circularly distributed among the nodes, and replicas are always placed in contiguous partitions [10].

To enforce the global uniqueness of entities, a Riak CS storage system also requires the service Stanchion [41], which serializes the requests that involve creation and modification of entities. When a user for instance tries to create a bucket with an already existing bucket name, this request is rejected by Stanchion. The same result is caused by the attempt to create a user with the e-mail address of an already existing user [22].

Riak KV, Stanchion and Riak CS are implemented in the programming language Erlang and are free software. The source code [34] is licensed according to the Apache License 2.0.


The installation of Riak KV, Stanchion and Riak CS, as well as of a compatible revision of the Erlang compiler, requires a lot of effort, because no pre-compiled packages were available for Raspbian and the ARM architecture of the Raspberry Pi. Compiling all required components takes more than an hour on a Raspberry Pi 3. The services Riak KV, Stanchion and Riak CS have their own configuration files and need to be started in the correct order.

With Riak CS Control, a standalone web application for the user management of Riak CS systems exists. Like the other Riak components already described, Riak CS Control is free software and implemented in Erlang. Unfortunately, all attempts to deploy Riak CS Control on the Raspberry Pi 3 failed in this work.

3.6 S3ninja

The S3ninja service is developed in the programming language Java and its source code is licensed according to the MIT License [36]. The service offers only single-node deployments. S3ninja provides a web user interface which allows carrying out all object and bucket related tasks.

From the perspective of the operating system, S3ninja requires just an installed Java runtime environment. In effect, the deployment and start of this service solution require little effort. An unexpected issue arose from the interaction of S3ninja with s3cmd: s3cmd forces the API to be accessible at path /, but the evaluated S3ninja revision v2.7 only permits the path /s3. This issue was solved by installing nginx and using it as a proxy.

In the configuration file of the service, ~/s3ninja/app/application.conf, the user access key and the secret access key can be specified. Also the path of the S3 data folder and the port number are specified inside this file.

3.7 S3rver

The service S3rver is developed in the programming language JavaScript and its source code is licensed according to the MIT License [37]. The software is executed on the server side by using Node.js. The service offers only single-node deployments and no graphical user interface.

In order to deploy S3rver, the Node.js runtime environment and the package manager npm need to be installed first. The installation of S3rver is done via the command npm install s3rver -g. Important service related parameters like the port number and the path of the S3 data folder are provided as command line parameters when starting the service. An unusual characteristic of S3rver is that the IP address also needs to be specified as a command line parameter. If it is not specified, the service only accepts connections from localhost. With the evaluated revisions, it was not possible to modify the access key and the secret access key via command line parameters or to specify them via environment variables.
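A minimal start sequence for S3rver is sketched below. The flag names (-h for the IP address, -p for the port, -d for the data folder) are an assumption about the evaluated revision v0.0.7 and may differ in other revisions:

    npm install s3rver -g
    # bind to all interfaces so that remote clients are accepted
    s3rver -h 0.0.0.0 -p 4568 -d /home/pi/s3rverdata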
3.8 Scality S3 Server

The Scality S3 Server, which offers only single-node deployments, is developed in the programming language JavaScript and its source code is licensed according to the Apache License 2.0 [38]. The software is executed on the server side by using the Node.js JavaScript runtime environment. If Node.js and the package manager npm are already installed, the service can be deployed via the command npm install.

One characteristic of Scality S3 Server is that its main memory consumption exceeds the physical main memory of the Raspberry Pi 3. In order to solve this issue, a swap partition of at least 1 GB in size or a swap file is required.

The user access key and secret access key can be specified via the environment variables SCALITY_ACCESS_KEY_ID and SCALITY_SECRET_ACCESS_KEY or inside the file ~/S3/conf/authdata.json. Further relevant configuration parameters like the logging level and the port number of the service are specified inside the file ~/S3/config.json.
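One way to provide the swap space mentioned above on Raspbian is a swap file created with standard Linux tools; the size and path below are only examples:

    # create and activate a 1 GB swap file so that the memory
    # consumption of Scality S3 Server does not exhaust the 1 GB
    # of physical main memory of the Raspberry Pi 3
    sudo dd if=/dev/zero of=/var/swapfile bs=1M count=1024
    sudo chmod 600 /var/swapfile
    sudo mkswap /var/swapfile
    sudo swapon /var/swapfile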

3.9 Swift

Swift is a component of the IaaS solution OpenStack. It is written in Python and licensed according to the Apache License 2.0 [31]. The OpenStack project was initiated in 2010 by Rackspace and NASA with the objective to develop a scalable, durable and highly available object storage, which can be deployed across a large number of nodes. The Swift service ensures data replication and integrity across the connected nodes.

Although being an object storage service like the other services examined in this work, Swift implements an API which is different from the S3-API. For interaction on the command line, the Python client [33] for the Swift API is a working solution. If the S3-API is mandatory, a middleware like swift3 [42] needs to be deployed.

The Swift service offers two options for authentication. The first option requires the OpenStack Keystone service to be installed and the second option makes use of the TempAuth functionality, which is implemented in Swift. The Keystone option was discarded, because a lightweight solution was wanted. The user credentials when using TempAuth are specified inside the file /etc/swift/proxy-server.conf. Further relevant configuration parameters, which are also specified inside this file, are among others the logging level and the port number of the Swift service.

The deployment of Swift on the Raspberry Pi 3 needed much effort because of the several software dependencies and the complex configuration. The installation of the latest revision 2.14.0 was possible, but because of a software bug or a configuration issue, it was only possible to create buckets and impossible to upload objects into the service. Therefore, revision 2.2.0 of the Swift service was installed, because packages of this revision already exist for the Raspbian operating system.

3.10 Walrus

Walrus is a component of the IaaS solution Eucalyptus [16]. It is written in Java and licensed according to the GPLv3 license. The development is mainly driven by Hewlett Packard Enterprise (HPE), since HP acquired the company Eucalyptus Systems in 2014. Of all private cloud service solutions mentioned in this work, Walrus has the longest development period. It has been a part of Eucalyptus since revision 1.4 in 2009.

The developers support only the installation of Eucalyptus inside the operating systems CentOS 7 and Red Hat Enterprise Linux 7. Installations on different Linux distributions cause much effort because of multiple dependencies. Walrus uses a folder inside a POSIX file system to store the buckets as folders and the object data as files. The durability of the stored data depends entirely on the used storage system.

A stand-alone installation of Walrus out of the box is impossible because Walrus does not utilize the POSIX file system for storing the metadata. The metadata is stored in a database, which is managed by the Cloud Controller (CLC), a different Eucalyptus component. The service allows only single-node deployments. Instead of adding multi-node functionality to Walrus, the developers decided to support using Riak CS or Ceph-RGW as object storage in Eucalyptus instead of Walrus.

Because a stand-alone installation of Walrus is impossible, the installation of Eucalyptus was not successful on Raspbian GNU/Linux, and installation attempts with Ubuntu 16.04 LTS failed as well, Walrus was not further evaluated for this work.

4 DEVELOPMENT OF S3PERF

An analysis of the existing testing solutions (see Section 2) resulted in the development of a new tool, called S3perf [5]. Its focus was to be lightweight, be compatible with the S3-API and the Swift API, have only few dependencies, cause only little effort for the installation and be simple to use.

One well-documented option to develop software which shall interact with AWS-compatible services is using the programming language Python together with the boto [8] library. To further simplify the implementation, the command line tools s3cmd [35] and the Swift client [33] were selected to carry out all interaction with the used storage services. These preconditions led to the development of a bash script.

Once a storage service is registered in the configuration file of s3cmd, S3perf can interact with its S3-compatible REST API. REST is an architectural style that relies on HTTP methods like GET or PUT. S3-compatible services use GET to receive the list of buckets that are assigned to a user account, a list of objects inside a bucket, or an object itself. Buckets and objects are created with PUT and DELETE is used to erase buckets and objects. POST can be used to upload objects directly from the web browser of the user and HEAD is used to retrieve metadata from an account, bucket or object. Table 2 gives an overview of the methods used to interact with S3-compatible storage services [4].

Table 2: Description of the HTTP methods with request-URIs that are used to interact with storage services

    Account-related operations
      GET    /                  List buckets
      HEAD   /                  Retrieve metadata
    Bucket-related operations
      GET    /bucket            List objects
      PUT    /bucket            Create bucket
      DELETE /bucket            Delete bucket
      HEAD   /bucket            Retrieve metadata
    Object-related operations
      GET    /bucket/object     Retrieve object
      PUT    /bucket/object     Upload object
      DELETE /bucket/object     Delete object
      HEAD   /bucket/object     Retrieve metadata
      POST   /bucket/object     Update object

If the Swift API shall be used for the communication with a storage service, S3perf does not use the command line tool s3cmd, but the Swift client. In contrast to s3cmd, the swift tool uses no configuration file to store user credentials. The user of S3perf needs to specify the endpoint of the used storage service, as well as the username and password, inside the environment variables ST_AUTH, ST_USER and ST_KEY.
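In other words, the HTTP methods of Table 2 are hidden behind subcommands of s3cmd and of the swift client. The following sketch shows one complete round trip with both tools; bucket, container and file names, as well as the endpoint and the TempAuth credentials, are only examples:

    # S3 API via s3cmd (endpoint and credentials configured
    # beforehand in the s3cmd configuration file)
    s3cmd mb s3://s3perf-bucket             # PUT /bucket: create bucket
    s3cmd put testfile s3://s3perf-bucket   # PUT /bucket/object: upload
    s3cmd ls s3://s3perf-bucket             # GET /bucket: list objects
    s3cmd get s3://s3perf-bucket/testfile   # GET /bucket/object: download
    s3cmd del s3://s3perf-bucket/testfile   # DELETE /bucket/object: erase
    s3cmd rb s3://s3perf-bucket             # DELETE /bucket: erase bucket

    # Swift API via the swift client, authenticated with TempAuth
    export ST_AUTH=http://raspberrypi:8080/auth/v1.0
    export ST_USER=test:tester
    export ST_KEY=testing
    swift post s3perf-container               # create the container
    swift upload s3perf-container testfile    # upload the object
    swift list s3perf-container               # list the objects
    swift download s3perf-container testfile  # download the object
    swift delete s3perf-container testfile    # erase the object
    swift delete s3perf-container             # erase the empty container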


[Figure 1: Analysis of the network performance of the Raspberry Pi 3 by using the NetPIPE benchmark. Four plots present the NetPIPE bandwidth (throughput in Mbit/s) and latencies (in seconds) of the end-to-end protocol TCP between two Raspberry Pi 3 nodes as a function of the message (payload) size in bytes, once with logarithmic scaling of both axes and once with logarithmic scaling of the x-axis only.]
A common assumption when using storage services which implement the same API is that all operations cause identical results. But during the development of S3perf, some challenges caused by non-matching service behavior emerged. One issue is the encoding of the bucket names. In order to conform to the DNS requirements, bucket names should not contain capital letters. To comply with this rule, the services Minio, Riak CS, S3rver and Scality S3 do not accept bucket names with capital letters. Other services like Nimbus Cumulus and S3ninja only accept bucket names which are encoded entirely in capital letters. The service offerings Amazon S3 and Google Cloud, as well as the private cloud solutions Fake S3 and OpenStack Swift, are more generous in this case and accept bucket names which are written in lowercase and capital letters. In order to be compatible with the different storage service solutions and offerings, S3perf allows the user to specify the encoding of the bucket names.

When doing a performance evaluation with S3perf, the tool executes these six steps for a specific number of objects of a specific size:

1. Create a bucket
2. Upload one or more objects into this bucket
3. Fetch the list of objects inside the bucket
4. Download the objects from the bucket
5. Erase the objects inside the bucket
6. Erase the bucket

The time which is required to carry out these operations is individually measured and can be used to analyze the performance of these commonly used storage service operations.

Users of S3perf have the freedom to specify via command-line parameters the number of files which shall be created, uploaded and downloaded, as well as their individual size. The files are created via the tool dd and contain pseudorandom data from /dev/random. In order to be able to simulate different load scenarios, the S3perf tool supports the parallel transfer of objects, as well as requesting delete operations in parallel. If the parallel flag is set, the steps 2, 4 and 5 are executed in parallel by using the command line tool parallel.
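The core of such a measurement round can be sketched with standard tools as follows. Object count, size and names are examples only; the real S3perf script additionally offers the parameters described above and prints the timing line described below:

    NUM=10
    for i in $(seq 1 $NUM); do    # create ten 512 kB test files via dd
      dd if=/dev/random of=s3perf-testfile$i bs=1K count=512 2>/dev/null
    done
    s3cmd mb s3://s3perf-bucket                                # step 1
    time (for i in $(seq 1 $NUM); do                           # step 2, sequential
      s3cmd put s3perf-testfile$i s3://s3perf-bucket
    done)
    time (seq 1 $NUM | \
      parallel s3cmd put s3perf-testfile{} s3://s3perf-bucket) # step 2, parallel
    time s3cmd ls s3://s3perf-bucket                           # step 3
    time (seq 1 $NUM | \
      parallel s3cmd get --force s3://s3perf-bucket/s3perf-testfile{})  # step 4
    time (seq 1 $NUM | \
      parallel s3cmd del s3://s3perf-bucket/s3perf-testfile{})          # step 5
    time s3cmd rb s3://s3perf-bucket                           # step 6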

Every time S3perf is executed, the tool prints out a line of data which informs the user about the date and time when the execution was finished, the number of created objects, the size of the single objects, as well as the required time in seconds to execute the steps 1-6. The final column contains the sum of all time values, which is calculated by using the command line tool bc. The structure of the output simplifies the analysis of the performance measurements by using tools like sed, awk and gnuplot.
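Because every run produces exactly one line with fixed columns, even a short awk call is sufficient for a first analysis. A sketch, in which the column number of the step-2 value is an assumption about the concrete output layout:

    # mean upload time (step 2) over all runs collected in results.log;
    # column 5 is assumed to hold the step-2 time in seconds
    awk '{ sum += $5; n++ } END { if (n > 0) print "mean:", sum / n, "s" }' results.log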

5 PERFORMANCE ANALYSIS OF THE DEPLOYED CLOUD STORAGE SERVICES

To generate a sufficient number of measurement data, S3perf was executed for each storage service five times with ten objects of sizes 512 Byte, 1 kB, 2 kB, 4 kB, 8 kB, 16 kB, 32 kB, 64 kB, 128 kB, 256 kB, 512 kB, 1 MB, 2 MB, 4 MB and 8 MB. This caused up to 75 test runs for each storage service. A few test runs failed because the resources of the used single board computer were fully utilized or because the service did not allow uploading files of a specific size.

When analyzing the required time to upload or download files, it must be kept in mind that the Raspberry Pi 3 is equipped with a 10/100 Mbit Ethernet controller that is internally connected to the USB hub. In practice, the network performance is less than 100 Mbit/s. The network performance between two Raspberry Pi 3 nodes was measured with the command-line tool iperf v2.0.5 [21] and with the NetPIPE v3.7.2 benchmark [40]. According to iperf, the network performance between the two nodes is approximately 94 Mbit per second. A more detailed analysis of the network performance is possible with the NetPIPE benchmark. It tests the latency and throughput over a range of message sizes between two processes. The benchmark was executed between the two nodes by just using TCP as end-to-end (transport layer) protocol. The results in Figure 1 show that the smaller a message is, the more the transfer time is dominated by the overhead of the communication layers. For larger messages, the communication rate becomes bandwidth limited by a component in the communication subsystem. Examples are the data rate of the network link, the utilization of the transmission medium, or a specific device between the sender and the receiver like the used network switch.

5.1 Step 1: Create a Bucket

The creation of a single bucket (Step 1) cannot be executed in parallel and it is not influenced by the size of the objects which are (in later steps) created and transferred by S3perf. Therefore, the boxplot in Figure 2, which presents the statistical distribution of the required time [s], does not mention any object sizes. The horizontal line inside each box is the median (second quartile). The smallest 50% of the points are smaller than or equal to the median value. 25% of the points are smaller than or equal to the bottom box boundary (first quartile) and 25% of the points are bigger than or equal to the top box boundary (third quartile).

The size of the box is also called the interquartile range (IQR). It is calculated via IQR = Q3 − Q1 and used to find outliers in the data. The whiskers extend from the ends of the box to the most distant point whose y-axis value has a maximum distance of 1.5 times the IQR. For example, with Q1 = 0.2 s and Q3 = 0.4 s, the IQR is 0.2 s and the whiskers reach at most 0.3 s beyond the box boundaries. Figure 2 does not show all outliers. In a few cases, creating a bucket with Riak CS and Swift required around 5 seconds.

[Figure 2: Time to create a bucket inside the tested storage services. Boxplot of the required time [s] over 50 tests for Cumulus, Fake S3, Minio, Riak CS, S3ninja, S3rver, Scality and Swift.]

Figure 2 shows that Riak CS and Swift require more time to create a bucket compared with the other service solutions. This is probably caused by their more complex way to store the data with replicas, which consumes additional resources, but in principle also offers the option to build systems with a better level of reliability.
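The quartile and whisker values used in boxplots like Figure 2 can be reproduced from the raw measurements with standard tools. A minimal sketch with a simplified quartile definition (statistics packages may interpolate differently):

    # Q1, median, Q3 and IQR of the measured times (one value per line)
    sort -n times.txt > sorted.txt
    n=$(wc -l < sorted.txt)
    q1=$(awk -v n="$n" 'NR == int(0.25 * n) + 1' sorted.txt)
    q2=$(awk -v n="$n" 'NR == int(0.50 * n) + 1' sorted.txt)
    q3=$(awk -v n="$n" 'NR == int(0.75 * n) + 1' sorted.txt)
    iqr=$(echo "$q3 - $q1" | bc -l)
    # the whiskers reach at most 1.5 * IQR beyond the box boundaries
    echo "median: $q2  IQR: $iqr"
    echo "whisker limits: $(echo "$q1 - 1.5 * $iqr" | bc -l) and $(echo "$q3 + 1.5 * $iqr" | bc -l)"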


5.2 Step 2: Upload the Objects into the Bucket

The required time to upload ten objects sequentially and in parallel into the tested storage services (Step 2) is presented in Figure 3. The measurements show that the parallel upload of the ten objects leads only in few cases to a significant acceleration compared with the sequential upload. In most cases, the period to upload the objects in a parallel way is longer compared to the sequential upload. The parallel upload is significantly beneficial when using Fake S3 with objects > 512 kB, Minio with objects > 1024 kB and < 8192 kB, Riak CS with objects > 256 kB and S3ninja with objects > 32 kB.

Of the evaluated services, Riak CS is the one that consumes the most main memory, and it does not free the main memory after an upload operation. After the upload of ten objects in parallel with a size of 2048 kB each, only approximately 170 MB of main memory were still available on the Raspberry Pi 3. The attempt to upload ten objects in parallel with a size of 4096 kB crashed the node. When trying to upload ten objects in serial with a size of 8192 kB, the main memory of the node gets entirely filled and also the swap storage is used. In effect, the node also crashed during this attempt.

The upload of objects > 1024 kB into S3ninja failed because nginx, which was used as a proxy, did not permit the upload of such large files. Another observation with S3ninja was that the service consumed one CPU core entirely at times during the upload of ten objects (in parallel and sequential operation mode).

Also Swift consumes a lot of main memory because of the way the objects, buckets and accounts are stored in rings with replicas. When deploying Swift on server resources, the replicas are distributed over multiple physical storage devices. On the Raspberry Pi 3 node, all replicas are stored on the microSD card, which is also used to store the operating system.
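Observations like the remaining 170 MB of free main memory can be reproduced by watching the memory usage on the node while S3perf runs, for example with:

    # print the physical memory and swap usage in MB once per second
    watch -n 1 free -m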
5.3 Step 3: Fetch the List of Objects

Fetching a list of objects (Step 3) cannot be executed in parallel and should not actually depend on the size of the objects, but as presented in Figure 4, when using S3ninja, the size of the objects has a strong influence on the required time. The root cause for this behavior may be the working method of the Java VM used.

It is also observed in this step that Riak CS and Swift provide a lesser performance compared with the other investigated solutions (except S3ninja), because Riak CS and Swift both store the objects with replicas.

5.4 Step 4: Download the Objects

The required time to download ten objects sequentially and in parallel from the tested storage services (Step 4) is presented in Figure 5. The measurements show that the parallel download of the ten objects leads only in few cases to a significant acceleration compared with the sequential download. In most cases, the period to download the objects in a parallel way is longer compared to the sequential download. The parallel download is significantly beneficial when using Cumulus with objects > 1 kB and < 2048 kB and S3ninja with objects > 16 kB.

5.5 Step 5: Erase the Objects

The required time to erase ten objects sequentially and in parallel with the tested storage services (Step 5) is presented in Figure 6. The measurements show that erasing the ten objects in parallel leads only in few cases to a significant acceleration compared with the sequential deletion. In most cases, the period to erase the objects in a parallel way is longer compared to the sequential operation mode. Erasing in parallel is beneficial when using S3ninja with objects > 8 kB.

Erasing objects should not actually depend on the size of the objects, but as presented in Figure 6, when using S3ninja, the size of the objects has a strong influence on the required time. The root cause for this observation may be the working method of the Java VM used.

5.6 Step 6: Erase the Bucket

Erasing a single bucket (Step 6) cannot be executed in parallel and, because the bucket is already empty, the performance of this operation is not influenced by the size of any objects. Therefore, the boxplot in Figure 7, which presents the statistical distribution of the required time [s], does not mention any object sizes.

The figure does not show all outliers. In a few cases, erasing a bucket with Riak CS required around 8 seconds and with Swift around 5 seconds.

6 CONCLUSIONS AND NEXT STEPS

The performance of single board computers like the Raspberry Pi 3 cannot compete with higher-value systems because of the characteristics of their relevant hardware components, especially the CPU, main memory, storage and network interface. Regardless of the performance, single board computers are useful for academic purposes and research projects because of the lower purchase costs and operating costs compared with commodity hardware server resources.


[Figure 3: Time to upload ten objects sequentially and in parallel into the tested storage services. Bar charts of the mean time of 5 tests in seconds per object size (0.5 kB to 8192 kB) for Cumulus, Fake S3, Minio, Riak CS, S3ninja, S3rver, Scality and Swift. Panels mark runs in which the node crashed or the upload operation failed.]


[Figure 4: Time to fetch the list of objects (more detailed view). Bar charts of the mean time of 5 tests in seconds per object size (0.5 kB to 8192 kB) for Cumulus, Fake S3, Minio, Riak CS, S3ninja, S3rver, Scality and Swift. Panels mark runs in which no object list could be fetched because the upload failed or the node crashed during the upload attempt.]

The hardware resources of a single Raspberry Pi 3 node are sufficient to deploy and use each one of the S3-API-compatible storage service solutions which have been examined in this work.

In order to investigate the performance of the analyzed storage services, a set of typical operations has been picked out and a lightweight tool, called S3perf, has been developed and implemented. This tool was used to carry out the set of operations and measure the required time. Based on these measurements, the performance of the storage services on a single board computer has been analyzed.

Results are, among others, that the service solutions Cumulus, Fake S3, Minio, S3ninja, S3rver and Scality provide for most operations a better performance compared with Riak CS and Swift. The reason for this observation is probably the way Riak CS and Swift store the objects and bucket data with replicas and organize them via Distributed Hash Tables. Riak CS and Swift both focus on working efficiently inside multi-node deployments and are not in the first place designed to provide a strong performance inside single node scenarios. Another observation was that the parallel execution of multiple upload, download and erase operations is only in few scenarios beneficial for the overall performance. Especially when using objects of 1 MB and more, the maximum throughput of the network interface of the Raspberry Pi 3 becomes a limiting factor. Furthermore, it has been observed that when using S3ninja, the performance of list and delete operations depends on the size of the objects.

Next steps are the deployment of Ceph-RGW, Minio, Riak CS and Swift inside multi-node systems of Raspberry Pi 3 nodes and an analysis of their performance and robustness.

For future work, it is also useful to analyze the total cost of ownership (TCO) of multi-node systems of single board computers which host storage services. As proven among others by Zhao et al. [48], the TCO of single board computer clusters can be better compared with higher-value systems.

Next steps also comprise an analysis of the maximum workload and storage capacity of the described systems in order to gather more information about the practical usability of storage services which are deployed on single board computer clusters.


[Figure 5: Time to download ten objects sequentially and in parallel from the tested storage services. Bar charts of the mean time of 5 tests in seconds per object size (0.5 kB to 8192 kB) for Cumulus, Fake S3, Minio, Riak CS, S3ninja, S3rver, Scality and Swift. Panels mark runs in which no object could be downloaded because the upload failed or the node crashed during the upload attempt.]


Time to erase 10 Objects Time to erase 10 Objects Time to erase 10 Objects Time to erase 10 Objects in parallel (Mean Time of 5 Tests) sequentially (Mean Time of 5 Tests) in parallel (Mean Time of 5 Tests) sequentially (Mean Time of 5 Tests)

Cumulus Cumulus Fake S3 Fake S3 10 10 10 10

8 8 8 8

6 6 6 6 [s] [s] [s] [s]

[Plot data removed: Figure 6 consists of paired bar-chart panels per storage service (among them Minio, Riak CS, S3ninja, S3rver, Scality and Swift), plotting the mean time of five tests [s] to erase ten objects in parallel and sequentially over the object size [kB] (0.5 to 8192). Missing bars are annotated with “Node crashed during upload attempt” or “No object to erase because upload failed”.]

Figure 6: Time to erase ten objects sequentially and in parallel from the tested storage services
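The measurement principle behind Figure 6 can be sketched in a few lines of Python. The following listing is not the S3perf tool itself (its source code is available at [5]); it is a minimal sketch that assumes the boto [8] library, an S3-compatible service at a placeholder endpoint with placeholder credentials, and an already existing bucket that contains the ten objects. It measures the time to erase the objects once sequentially and once in parallel, with one thread per delete operation.

# Minimal sketch, not the S3perf tool itself. Endpoint, credentials and
# bucket name are placeholders and must be adapted to the service under test.
import time
from threading import Thread
from boto.s3.connection import S3Connection, OrdinaryCallingFormat

conn = S3Connection(aws_access_key_id='KEY', aws_secret_access_key='SECRET',
                    host='192.168.1.10', port=9000, is_secure=False,
                    calling_format=OrdinaryCallingFormat())
bucket = conn.get_bucket('s3perf-testbucket')       # assumed to exist and to
names = ['s3perf-object%d' % i for i in range(10)]  # contain these ten objects

# Erase ten objects sequentially, one delete request after another.
start = time.time()
for name in names:
    bucket.delete_key(name)
print('sequential erase: %.4f s' % (time.time() - start))

# Erase ten objects in parallel, one thread per delete request. (In a real
# measurement the ten objects would be uploaded again before this second run.)
# A shared connection object is assumed to be safe here; one connection per
# thread would be the more conservative choice.
threads = [Thread(target=bucket.delete_key, args=(n,)) for n in names]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print('parallel erase: %.4f s' % (time.time() - start))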


[Plot data removed: Figure 7 shows the required time [s] to erase a bucket (50 tests), with one entry per service: Swift, Minio, S3rver, Scality, S3ninja, Riak CS, Fake S3 and Cumulus; the y-axis ticks range from 0.1 to 0.9 s.]

Figure 7: Time to erase the bucket (50 tests)

ACKNOWLEDGEMENTS

This work is funded by the Hessian Ministry of Higher Education, Research, and the Arts (’Hessisches Ministerium für Wissenschaft und Kunst’) in the framework of research for practice (’Forschung für die Praxis’).

Many thanks to Katrin Baun for her assistance in improving the quality of this paper.

REFERENCES

[1] B. Apperson, “Building a ceph cluster on raspberry pi,” http://bryanapperson.com/blog/the-definitive-guide-ceph-cluster-on-raspberry-pi/, accessed 12th August 2017.

[2] C. Baun, “Mobile clusters of single board computers: an option for providing resources to student projects and researchers,” SpringerPlus, vol. 5, no. 1, 2016.

[3] C. Baun, “Performance and energy-efficiency aspects of clusters of single board computers,” International Journal of Distributed and Parallel Systems, vol. 7, no. 2/3/4, 2016.

[4] C. Baun, M. Kunze, D. Schwab, and T. Kurze, “Octopus - a redundant array of independent services (rais),” in CLOSER 2013: Proceedings of the 3rd International Conference on Cloud Computing and Services Science. SCITEPRESS, 2013, pp. 321–328.

[5] C. Baun and R.-M. Spanou, “s3-perf source code,” https://github.com/christianbaun/s3perf, accessed 12th August 2017.

[6] Z. Bjornson, “Aws s3 vs google cloud vs azure: Cloud storage performance,” http://blog.zachbjornson.com/2015/12/29/cloud-storage-performance.html, accessed 12th August 2017.

[7] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, “Fog computing and its role in the internet of things,” in Proceedings of the first edition of the MCC workshop on Mobile cloud computing. ACM, 2012, pp. 13–16.

[8] Boto, “Boto source code and documentation,” https://github.com/boto/boto, accessed 12th August 2017.

[9] J. Bresnahan, K. Keahey, D. LaBissoniere, and T. Freeman, “Cumulus: An Open Source Storage Cloud for Science,” in Proceedings of the 2nd international workshop on Scientific cloud computing. ACM, 2011, pp. 25–32.

[10] N. Canceill, W. de Jong, and J. Saathof, “A study of scalable storage systems,” SURF, Tech. Rep., 2014.

[11] Ceph, “Ceph Object Gateway,” http://docs.ceph.com/docs/master/radosgw/, accessed 12th August 2017.

[12] COSBench, “Cloud object storage benchmark – source code and documentation,” https://github.com/intel-cloud/cosbench, accessed 12th August 2017.

[13] J. Coyle, “Create a 3 node ceph storage cluster,” https://www.jamescoyle.net/how-to/1244-create-a-3-node-ceph-storage-cluster, accessed 12th August 2017.

[14] J. Coyle, “Small scale ceph replicated storage,” https://www.jamescoyle.net/how-to/2105-small-scale-ceph-replicated-storage, accessed 12th August 2017.

[15] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, “Dynamo: amazon’s highly available key-value store,” ACM SIGOPS operating systems review, vol. 41, no. 6, pp. 205–220, 2007.


[16] Eucalyptus, “Eucalyptus source code and documentation,” https://github.com/eucalyptus/eucalyptus, accessed 12th August 2017.

[17] Fake S3, “Fake S3 source code and documentation,” https://github.com/jubos/fake-s3, accessed 12th August 2017.

[18] S. Garfinkel, “An evaluation of amazon’s grid computing services: Ec2, s3, and sqs,” Harvard Computer Science Group Technical Report TR-08-07, 2007.

[19] D. Guinard and V. Trifa, Building the web of things: with examples in node.js and raspberry pi. Manning Publications Co., 2016.

[20] HPL, “A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers,” http://www.netlib.org/benchmark/hpl/, accessed 12th August 2017.

[21] Iperf, “Iperf source code and documentation,” http://www.nwlab.net/art/iperf/, accessed 12th August 2017.

[22] P. Jain, A. Goel, and S. Gupta, “Monitoring of riak cs storage infrastructure,” Procedia Computer Science, vol. 54, pp. 137–146, 2015.

[23] D. Karger, E. Lehman, T. Leighton, R. Panigrahy, M. Levine, and D. Lewin, “Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web,” in Proceedings of the twenty-ninth annual ACM symposium on Theory of computing. ACM, 1997, pp. 654–663.

[24] K. Keahey, M. Tsugawa, A. Matsunaga, and J. Fortes, “Sky computing,” Internet Computing, IEEE, vol. 13, no. 5, pp. 43–51, Sept 2009.

[25] Y. N. Krishnan, C. N. Bhagwat, and A. P. Utpat, “Fog computing - network based cloud computing,” in Electronics and Communication Systems (ICECS), 2015 2nd International Conference on. IEEE, 2015, pp. 250–251.
[26] L. Land, “Real-world benchmarking of cloud storage providers: Amazon s3, google cloud storage, and azure blob storage,” https://lg.io/2015/10/25/real-world-benchmarking-of-s3-azure-google-cloud-storage.html, accessed 12th August 2017.

[27] R. McFarland, “s3-perf source code,” https://github.com/ross/s3-perf, accessed 12th August 2017.

[28] Minio, “Minio source code and documentation,” https://github.com/minio/minio, accessed 12th August 2017.

[29] A. Nayak, A. Poriya, and D. Poojary, “Type of databases and its comparison with relational databases,” International Journal of Applied Information Systems, vol. 5, no. 4, pp. 16–19, 2013.

[30] Nimbus, “Nimbus source code and documentation,” https://github.com/nimbusproject/nimbus, accessed 12th August 2017.

[31] OpenStack Storage (Swift), “Swift source code and documentation,” https://github.com/openstack/swift, accessed 12th August 2017.

[32] M. R. Palankar, A. Iamnitchi, M. Ripeanu, and S. Garfinkel, “Amazon s3 for science grids: a viable solution?” in Proceedings of the 2008 international workshop on Data-aware distributed computing. ACM, 2008, pp. 55–64.

[33] Python client for the Swift API, “Source code and documentation,” https://github.com/openstack/python-swiftclient, accessed 12th August 2017.

[34] Riak CS, “Riak CS source code and documentation,” https://github.com/basho/riak_cs, accessed 12th August 2017.

[35] s3cmd, “s3cmd source code and documentation,” https://github.com/s3tools/s3cmd, accessed 12th August 2017.

[36] S3ninja, “S3ninja source code and documentation,” https://github.com/scireum/s3ninja, accessed 12th August 2017.

[37] S3rver, “S3rver source code and documentation,” https://github.com/jamhall/s3rver, accessed 12th August 2017.

[38] Scality S3 Server, “S3 Server source code and documentation,” https://github.com/scality/S3, accessed 12th August 2017.

[39] K. Skala, D. Davidovic, E. Afgan, I. Sovic, and Z. Sojat, “Scalable distributed computing hierarchy: Cloud, fog and dew computing,” Open Journal of Cloud Computing (OJCC), vol. 2, no. 1, pp. 16–24, 2015. [Online]. Available: http://nbn-resolving.de/urn:nbn:de:101:1-201705194519

[40] Q. O. Snell, A. R. Mikler, and J. L. Gustafson, “Netpipe: A network protocol independent performance evaluator,” in IASTED International Conference on Intelligent Information Management and Systems, vol. 6. USA, 1996.


[41] Stanchion, “Stanchion source code and documentation,” https://github.com/basho/stanchion, accessed 12th August 2017.

[42] Swift3, “Swift3 middleware for openstack swift - source code and documentation.”

[43] N. Tiwari, “Enterprise-grade cloud storage with nginx plus and minio,” https://www.nginx.com/blog/enterprise-grade-cloud-storage-nginx-plus-minio/, accessed 24th June 2017.

[44] R. Vijayan, “Ceph on a raspberry pi,” https://www.linkedin.com/pulse/ceph-raspberry-pi-rahul-vijayan, accessed 12th August 2017.

[45] Y. Wang, “Definition and categorization of dew computing,” Open Journal of Cloud Computing (OJCC), vol. 3, no. 1, pp. 1–7, 2016. [Online]. Available: http://nbn-resolving.de/urn:nbn:de:101:1-201705194546

[46] S. A. Weil, S. A. Brandt, E. L. Miller, D. D. Long, and C. Maltzahn, “Ceph: A scalable, high-performance distributed file system,” in Proceedings of the 7th symposium on Operating systems design and implementation. USENIX Association, 2006, pp. 307–320.

[47] C. W. Zhao, J. Jegatheesan, and S. C. Loon, “Exploring iot application using raspberry pi,” International Journal of Computer Networks and Applications, vol. 2, no. 1, pp. 27–34, 2015.

[48] Y. Zhao, S. Li, S. Hu, H. Wang, S. Yao, H. Shao, and T. Abdelzaher, “An experimental evaluation of datacenter workloads on low-power embedded micro servers,” Proceedings of the VLDB Endowment, vol. 9, no. 9, pp. 696–707, 2016.

[49] Q. Zheng, H. Chen, Y. Wang, J. Duan, and Z. Huang, “Cosbench: A benchmark tool for cloud object storage services,” in Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on. IEEE, 2012, pp. 998–999.

[50] Q. Zheng, H. Chen, Y. Wang, J. Zhang, and J. Duan, “Cosbench: cloud object storage benchmark,” in Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering. ACM, 2013, pp. 199–210.

AUTHOR BIOGRAPHIES

Dr. Christian Baun is a Professor at the Faculty of Computer Science and Engineering of the Frankfurt University of Applied Sciences in Frankfurt am Main, Germany. He earned his Diploma degree in Informatik (Computer Science) in 2005 and his Master degree in 2006 from the Mannheim University of Applied Sciences. In 2011, he earned his Doctor degree from the University of Hamburg. He is the author of several books, articles and research papers. His research interests include operating systems, distributed systems and computer networks.

Henry-Norbert Cocos studies computer science at the Frankfurt University of Applied Sciences. His research interests include distributed systems and single board computers. Currently, he constructs a 256-node cluster of Raspberry Pi 3 nodes, which shall be used to analyze different parallel computation tasks. For this work, he analyzes which administration tasks need to be carried out during the deployment and operation phases and how these tasks can be automated.

Rosa-Maria Spanou studies computer science at the Frankfurt University of Applied Sciences. Her research interests include distributed systems and single board computers. Currently, she constructs and analyzes different multi-node object-based cloud storage solutions.
