Performance Analysis of Object Store Systems in a Fog/Edge Computing Infrastructure


Bastien Confais∗, Adrien Lebre†, Benoît Parrein‡

∗CNRS, IRCCyN UMR 6597, 1, rue de la Noë, BP 92101, 44321 Nantes Cedex 3, France
†INRIA, LINA UMR 6241, Mines de Nantes, 4 Rue Alfred Kastler, 44300 Nantes, France
‡Université de Nantes, IRCCyN UMR 6597, Polytech Nantes, rue Christian Pauc, BP 50609, 44306 Nantes Cedex 3, France

Abstract—Fog/Edge computing infrastructures have been proposed as an alternative to current Cloud Computing facilities in order to address the latency issue that prevents the development of several applications. The main idea is to deploy smaller data-centers at the edge of the backbone in order to bring cloud computing resources closer to the end users. While a couple of works have illustrated the advantages of such infrastructures, in particular for IoT applications, the way of designing elementary services that can take advantage of such massively distributed infrastructures has not yet been discussed. In this paper, we propose to deal with such a question from the storage point of view. First, we propose a list of properties a storage system should meet in this context. Second, we evaluate through performance analysis three "off-the-shelf" object store solutions, namely Rados, Cassandra and InterPlanetary File System (IPFS). In particular, we focus (i) on the access times to push and get objects under different scenarios and (ii) on the amount of network traffic that is exchanged between the different sites during such operations. Experiments have been conducted using the Yahoo Cloud System Benchmark (YCSB) on top of the Grid'5000 testbed. We show that, among the three tested solutions, IPFS fulfills most of the criteria expected for a Fog/Edge computing infrastructure.

I. INTRODUCTION

The advent of smartphones, tablets as well as IoT devices has revolutionized the way people consume IT services. While lots of applications take advantage of the Internet and of Cloud Computing solutions to extend devices' capabilities in terms of computation as well as storage, reaching data centers (DCs) operated by giant actors such as Amazon, Google and Microsoft implies significant penalties in terms of network latency, preventing a large number of services from being deployed [1]. The Fog Computing paradigm [2] has been proposed to overcome such a limitation: dedicated servers are deployed in micro/nano DCs geographically spread at the edge of the network so that it becomes possible to execute latency-dependent applications as close as possible to the end users and to keep non-sensitive ones in traditional Cloud DCs.

Also known as Edge Computing, the advantages of such infrastructures have been described through several scenarios [3], [4]. However, all these studies focus on particular use-cases and do not investigate whether generic Fog/Edge services can be envisioned. In this paper, we propose to address such a question for a cloud storage service. Concretely, we discuss an empirical analysis of three storage systems with the ultimate goal of delivering a system such as the Simple Storage Service (S3) of Amazon. S3 is one of the most used services offered by Amazon and a building block for hundreds of cloud services. We believe that providing such a storage service for Fog/Edge Computing infrastructures can pave the way toward new services as well as IoT applications.

The three storage systems we studied are Rados [5], the object storage module of the Ceph project, Cassandra [6], a high-performance key/value store, and the InterPlanetary File System (IPFS) [7], a storage system which uses the concepts brought by BitTorrent. We selected these three systems because they propose software abstractions for the definition of geographical areas without relying on a central server. We claim that such software-defined areas are important for Fog/Edge Computing because they enable the storage system to take the topology of the infrastructure into account (i.e., one software area corresponds to one particular micro DC, that is one Fog/Edge site). Considering the topology enables the development of advanced strategies that can favor local accesses with the goal of mitigating the traffic that is exchanged between the sites as well as the impact each site may have on the others. Moreover, each of the selected systems relies on a different software architecture as well as different communication protocols, which allows us to see the pros and cons of the different development choices in a Fog/Edge Computing context.

The main contribution of our work is a deep performance evaluation enabling the identification of the pros and cons of the strategies implemented by those three systems. Experiments have been done on top of Grid'5000 [8] by considering several data manipulation scenarios leveraging the Yahoo Cloud System Benchmark (YCSB) [9], a well-known benchmark tool particularly designed to benchmark object stores [10], [11].

The remainder of the paper is organized as follows. Section II defines the Fog/Edge computing model we consider and gives a list of characteristics a S3-like service should have in such a context. Section III presents an overview of the three storage systems. Evaluations are discussed in Section IV. Finally, Section V concludes this study and highlights some perspectives.

Fig. 1. Overview of a Cloud, Fog and Edge platform.

II. FOG/EDGE COMPUTING MODEL

In this section, we present the Fog/Edge architecture we are considering and a list of characteristics we claim an object storage system should have in such a context.

A. Fog/Edge Computing: a hierarchy of resources

Industrials as well as academics argue in favor of a new model of distributed computing composed of IT resources spread from the Cloud to the Extreme Edge. Such an infrastructure follows a hierarchical topology from the points of view of distance and power capabilities: Cloud facilities are the farthest elements in terms of network latencies but the ones that provide the largest computing and storage capabilities. Edge/Extreme Edge devices can benefit from local computing and storage resources, but those resources are limited in comparison to the Cloud ones. Finally, Fog sites can be seen as intermediate facilities that offer a tradeoff between distance and power capabilities of IT resources [1], [12]. Moreover, Fog sites can complement each other to satisfy the needs between users' devices and Cloud Computing centers [3], [13]. Figure 1 illustrates such a description. The Fog platform is composed of a significant number of sites that can be geographically spread over a large area. Each site hosts a limited number of servers that offer storage and computing capabilities. End-user devices (smartphones, tablets, laptops) as well as IoT devices can reach a Fog site with a rather low latency. We consider that the latency between Fog sites (noted Lcore) is up to 50 ms (mean latency of a Wide Area Network link) and that the latency between users and their site (noted Lfog) is less than 10 ms (latency of a wireless link). The latency to reach a Cloud platform (noted Lcloud) from the clients is large in comparison to the other durations (about 200 ms) [14], [15] and unpredictable [1].

B. Storage requirements

Our objective is to study how a storage service such as a S3 object store system should be designed to deal with, and take advantage of, Fog/Edge Computing infrastructure specifics. In addition to the scalability requirement that is intrinsic to such a massively distributed infrastructure, we advocate that a Fog/Edge computing storage service should meet the following properties:

• low access time (by favoring data locality);
• network containment between sites;
• availability of data in case of service partitioning;
• support for user mobility.

Low access time is the main characteristic behind the motivation for the Fog paradigm. The idea is to favor local accesses each time it is possible (i.e., each put into the object store should be handled by the closest site, assuming that each fog site delivers the same storage performance).

Network containment is the idea that an action on one site does not impact the other sites negatively. In other words, if one site faces a peak of activity, the performance of the other sites should not change. We believe that such a criterion can be met by mitigating data transfers between sites each time an operation is performed.

The third feature is related to the partitioning of the storage service that can occur each time one site is disconnected from the other ones. While replication strategies between sites can ensure data availability, we claim that a Fog storage service should be able to run, at least, in a degraded mode in order to tolerate local accesses and provide appropriate mechanisms to reconsolidate the service once the disconnection is over.

Mobility support is the last property we have identified. The idea is to enable data to transparently follow its usages. To illustrate such a feature, imagine a user moving from one radio base station to another one. In such a situation, the storage service should be able to relocate the solicited data in a transparent manner from the previous site to the new one. Such a transparent relocation of data will mitigate remote accesses.

Last but not least, we underline that discussing the consistency model of the storage service we target is beyond the scope of this first study. For the moment and for the sake of simplicity, we consider objects like files and documents of one user with no parallel accesses.

III. "OFF-THE-SHELF" DISTRIBUTED STORAGE SYSTEMS

Distributed file systems such as PVFS [16], Lustre [17], HDFS [18] and other WAN-wide proposals [19], [20] are not appropriate for the Fog/Edge Computing infrastructure we target because they have all been designed around the concept of an entity in charge of maintaining the storage namespace in a rather "centralized" manner. Because they leverage P2P mechanisms, object and key/value store systems are better candidates to cope with the requirements we defined in the previous section. Among the different solutions that are available, we selected Rados [5], Cassandra [6] and IPFS [7] due to the fact that they provide software abstractions that enable the definition of areas that can be mapped to geographical sites. We present in the following paragraphs an overview of those three systems.

A. Rados

Rados [5] is a distributed object storage solution which uses the CRUSH algorithm [21] to locate data in the cluster without requiring any remote communication from the clients. Rados uses two kinds of nodes: Object Storage Daemons (OSD) and Monitors. The former are used to store data whereas the latter maintain a tree (a.k.a. the "clustermap") describing the cluster's topology. Among the monitors, one is elected as the master and is in charge of maintaining a consistent view of the clustermap (the Paxos algorithm is used to guarantee that there is only one master monitor). The "clustermap", which is distributed to each client node, is used by the CRUSH algorithm to locate objects.

Rados allows administrators to organize objects within pools. Each pool is associated to "placement rules". In our context, we use placement rules to constrain the objects of a particular pool to be located on one site. To write or read an object, clients must provide the name of the pool.

In case of network partitioning, data that is located in the partition where a master monitor can still be elected is available. In other words, clients that belong to this partition can access the "clustermap" and thus any object that is reachable. In the other partitions, because clients cannot get the "clustermap", they cannot locate and thus access any object.
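To make the previous paragraphs more concrete, the following minimal sketch illustrates the general idea behind such computation-based placement: every client derives an object's location from a shared cluster description instead of asking a central server. It is only an illustration of the principle under simplifying assumptions (a flat list of OSDs per site and a plain hash); it is neither the actual CRUSH algorithm nor the Rados API, and all names are hypothetical.

import hashlib

# Simplified "clustermap": one pool per site, each pool constrained to the
# OSDs of that site (mimicking the placement rules described above).
CLUSTERMAP = {
    "pool_site1": ["osd.0", "osd.1"],
    "pool_site2": ["osd.2", "osd.3"],
}

def locate(pool, object_name, clustermap=CLUSTERMAP):
    """Deterministically map an object to an OSD of the pool's site.

    Every client holding the same clustermap computes the same location,
    so no remote lookup is needed (the principle CRUSH relies on).
    """
    osds = clustermap[pool]
    digest = hashlib.sha1(f"{pool}/{object_name}".encode()).hexdigest()
    return osds[int(digest, 16) % len(osds)]

# A put and a later get resolve to the same local OSD without any
# communication outside the site.
assert locate("pool_site1", "photo-42.jpg") == locate("pool_site1", "photo-42.jpg")
print(locate("pool_site1", "photo-42.jpg"))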
Figure 2 shows the sequence diagrams of the exchanges we observe in a Rados deployment, respectively from a client (a), an OSD (b) and a monitor (c) point of view. In addition to the exchanges we described, Rados uses a large number of keepalive messages between the different nodes. This enables the system to swiftly react in case of failures. It is noteworthy to mention that with Rados, when no cache is configured, each time a client wants to access an object, it has to pull it from the OSD where it is persistently stored. Overall, the inter-site overhead is a mix of the Paxos messages and usage statistics sent between the monitors, the inter-site traffic related to the reports of OSD status, and the size of the "clustermap" each time a client needs to retrieve it. To mitigate this overhead as much as possible, it is possible to place one monitor per site. This minimizes the reports of OSD status between sites as well as the overhead related to the clustermap retrievals.

Fig. 2. Sequence diagrams of the network traffic observed in Rados: (a) from a client point of view; (b) from an OSD point of view; (c) from a monitor point of view (the monitor on site 1 is elected).

B. Cassandra

Cassandra is a key/value store designed on a one-hop distributed hash table (DHT) [6]. The keyspace is divided into ranges that are distributed among the nodes composing the system. A gossip protocol is used to distribute the topology of the system, the status of the nodes and the ranges assigned to them. Once gossiped data is received, storage nodes can locate any object without any extra communication.

A quorum, specified by users, defines the number of replicas that has to be read or written for each request in order to validate the operation. Depending on the values used in this quorum, Cassandra can provide different levels of consistency. This quorum therefore provides a trade-off between access time and consistency. Moreover, Cassandra exposes the same notion of pools as the one proposed by Rados. Entitled "keyspaces", each pool is associated to a placement strategy. The "NetworkTopologyStrategy", for instance, enables administrators to specify for each datacenter (and in our case, for each site) how many replicas of an object they want to store. For the purpose of our analysis, we defined a strategy where there is only one copy of an object in a particular site throughout the whole system. Such a strategy enables Cassandra to mitigate the traffic between sites each time an object is pushed into it. Having a single replica guarantees strong consistency.

Like Rados, users have to specify the name of the keyspace they want to use. The major exchanges related to the Cassandra protocol are illustrated in Figure 3: clients retrieve the topology from the server they connect to. Based on the topology they retrieved, they contact one of the nodes of the site they belong to. For performance purposes, requests are balanced in a round-robin way. That is, a client sends requests alternately from one server to another within the same site. The request is then forwarded to the server which has to handle it (that server is determined based on the aforementioned strategy).

Thanks to the "NetworkTopologyStrategy", the gossip traffic is the only overhead that goes across the different sites.

Fig. 3. Sequence diagram of the network traffic observed in Cassandra.
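As an illustration of the strategy described above, a keyspace restricted to a single site could be declared as follows with the DataStax Python driver. This is a hedged sketch, not the authors' configuration: the contact point, the keyspace name and the data-center label "site1" are assumptions, and the snippet assumes the cluster's snitch is configured so that the local data center is indeed named "site1".

from cassandra.cluster import Cluster

# Connect to a seed node of the local Fog site (address is illustrative).
cluster = Cluster(["10.0.1.1"])
session = cluster.connect()

# One keyspace per site: NetworkTopologyStrategy keeps exactly one replica
# of every object in the data center named "site1" and none elsewhere,
# which contains the write traffic within the site.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS fog_site1
    WITH replication = {'class': 'NetworkTopologyStrategy', 'site1': 1}
""")

# Clients then address objects through this keyspace.
session.set_keyspace("fog_site1")
session.execute("CREATE TABLE IF NOT EXISTS objects (name text PRIMARY KEY, data blob)")

cluster.shutdown()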

InterPlanetary File System [7] has been built on the BitTorrent protocol [22] and the Kademlia DHT [23], both being well-known protocols for their ability to scale to a large number of nodes. While the BitTorrent protocol is used to manipulate objects between the different peers of the system in an efficient manner, the Kademlia DHT is in charge of storing the objects' locations. We underline that only the locations, and not the content of the objects, are stored in the DHT. Such a management of the data locations is an important difference in comparison to Rados and Cassandra, which have designed dedicated mechanisms to locate objects without extra communication from the clients' viewpoint.

Figure 4 shows the major exchanges of IPFS. When a client wants to put an object, it sends the object to a node. This node saves the object locally and then puts the location of the object in the Kademlia DHT. Reciprocally, when a client wants to get an object, it has to contact one peer of IPFS that will use the Kademlia DHT to determine the node in charge of delivering the object. Based on the Kademlia reply, the request is forwarded to the correct node. This node will send the object to the initial IPFS peer, which will make a copy before serving the client. Thanks to such an approach, IPFS supports the mobility of data in a native fashion: a future read will be satisfied by this node, closer to the client. Storage nodes regularly send keepalive messages to maintain the DHT.

It is noteworthy that IPFS uses immutable objects: modifying an existing object creates a new one. Because objects are immutable, it is easier to maintain the consistency between all the replicas of one object. Moreover, the BitTorrent protocol that is used to pull the data makes it possible to retrieve the same object from several sources simultaneously.

Last but not least, IPFS can work partially in disconnected mode as long as both the object location can be found in the Kademlia DHT and the node storing the object is reachable.

Fig. 4. Sequence diagram of the network traffic observed in IPFS from a client point of view. The read from "client2" can be performed using "storage node2" only after "client1" has read from "storage node2".

Fig. 5. Topology used to deploy Rados (a), Cassandra and IPFS (b) in a Fog environment.
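The toy sketch below illustrates how a Kademlia-style DHT decides which peers keep a location record: node and key identifiers live in the same space, and the record is stored on the identifiers that are closest to the key according to the XOR metric. It is only an illustration of the principle used by IPFS, not its implementation; the peer names, the example key and the replication factor k=3 are arbitrary assumptions.

import hashlib

def node_id(name):
    """Derive a 160-bit identifier, as Kademlia does for nodes and keys."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def closest_nodes(key, nodes, k=3):
    """Return the k nodes whose identifiers are XOR-closest to the key.

    In an IPFS-like system, the location record of an object (not the
    object itself) would be stored on such nodes; k=3 is arbitrary here
    (the unmodified IPFS mentioned in Section IV uses more replicas).
    """
    kid = node_id(key)
    return sorted(nodes, key=lambda n: node_id(n) ^ kid)[:k]

peers = [f"fog-site{i}-node{j}" for i in range(1, 8) for j in range(2)]
print(closest_nodes("/ipfs/QmExampleObject", peers))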

D. Summary

Table I summarizes how Rados, Cassandra and IPFS fit the requirements that have been defined in Section II-B.

TABLE I
SUMMARY OF FOG CHARACTERISTICS A PRIORI MET FOR THE 3 DIFFERENT OBJECT STORES.

                      Rados      Cassandra   IPFS
Data locality         Yes        Yes         Yes
Network containment   No         Yes         No
Disconnected mode     No         Yes         Partially
Mobility support      Partially  Partially   Natively

IV. EXPERIMENTS

In this section, we discuss the evaluations we performed on the three systems. The first experiment (Section IV-B) focuses on local access times and on the amount of network traffic exchanged between sites. The second one (Section IV-C) measures the remote access time, when data is accessed from a site which does not store it.

A. Material and method

The Fog architecture as well as the way we deployed the different systems are presented in Figure 5.

For Rados, each site hosts one client, two storage servers (small sites) and one monitor. With Cassandra and IPFS, one server per site is configured as a seed node, so that other nodes can join the cluster without inter-site communications. With Rados, each pool is associated to a placement rule to force data to be stored on a specific site and, with Cassandra, the "NetworkTopologyStrategy" is used as previously described in Section III-B. With IPFS, locality is natively supported because data is located on the node the client writes to.

Experiments are performed on the Grid'5000 testbed using the "paravance" cluster. Latencies between hosts are emulated using the tc program. We use Lfog = 10 ms and Lcore = 50 ms (as depicted in Figure 1). The throughput of the network links is 10 Gbps.

Because each system has its own consistency/replication strategy by default, we modified those configurations in order to write only one copy of each object. In addition to inserting new objects as fast as possible, it enables us to fairly compare the systems.

The metrics we measure are the time taken to perform the operations on each site and the amount of network traffic sent between the sites during that period. The flushing operation (sync operation) is not taken into account in the measurements.

Regarding IPFS, and still in order to ensure fair comparisons between the three systems, requests have been balanced between the servers of a same site. This means that one object could have been written on one server while a read request is sent to the other server of the site (resulting in an additional read copy on that server, as described in Section III-C). Moreover, IPFS adds redundancy in the Kademlia DHT: for each object, the location is stored on 10 different nodes to support failures. We modified IPFS to avoid this replication of metadata.

The Yahoo Cloud System Benchmark (YCSB) [9] has been used to emulate data accesses. We use different access patterns in order to check if the storage systems may be used for different kinds of applications. Object sizes vary from 1 MB to 10 MB. Scenarios have been performed using 1, 7 and 11 sites. Each experiment has been performed at least 10 times (10 trials) to get consistent results, and the stored data is erased between trials.

Cassandra and Rados modules for YCSB already exist, but we wrote the module for IPFS. The Rados module opens a connection in each thread, contrary to the Cassandra module in which only the first thread establishes a connection to the cluster. The access times measured for Cassandra are high compared to what we measure with our "home-made" benchmark script. Indeed, the garbage collector is called when the connection to the cluster is closed, which, from experience, increases the access times considerably. Therefore, we decided to remove this instruction. Writes are performed during the load phase of YCSB and reads during the run phase. The workload used performs only read operations, on objects selected according to a uniform distribution.
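To make the workload description concrete, the read-only, uniformly distributed access pattern could be expressed as a YCSB core-workload property file along the following lines. This is a sketch under assumptions: the values correspond to the "100 x 1 MB" scenario, the file name and binding invocation are illustrative, and the authors' actual workload files and IPFS module are not reproduced here.

# Sketch: generate a YCSB core-workload file for the "100 objects of 1 MB,
# read-only, uniform distribution" scenario (values are illustrative).
workload = {
    "recordcount": 100,               # objects inserted during the load phase (writes)
    "operationcount": 100,            # operations issued during the run phase (reads)
    "fieldcount": 1,                  # a single field per object
    "fieldlength": 1024 * 1024,       # 1 MB per object
    "readproportion": 1.0,            # the run phase performs only reads
    "updateproportion": 0.0,
    "insertproportion": 0.0,
    "requestdistribution": "uniform"  # objects selected uniformly at random
}

with open("workload-fog-100x1MB", "w") as f:
    for key, value in workload.items():
        f.write(f"{key}={value}\n")
    # A real file also needs the "workload" property pointing to the
    # CoreWorkload class, whose package name depends on the YCSB release.

# Writes are then performed by the YCSB load phase and reads by the run
# phase, e.g. (binding name depends on the system under test):
#   bin/ycsb load <binding> -P workload-fog-100x1MB
#   bin/ycsb run  <binding> -P workload-fog-100x1MB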
B. Local read/write evaluation (experiment 1)

All clients (one per site) execute the access scenario simultaneously: all clients write objects on their site and then read them. The goal is to evaluate the data locality as well as the network containment properties on each site.

1) Writing and reading times: Table II shows the access times measured with YCSB. All standard deviations are less than 0.6 seconds and thus have not been reported.

The first observation is that, with Rados, access times do not increase significantly when sites are added. As an example, it takes 3.15 seconds to write one object of 10 MB on 1 site and 3.67 seconds on 11 sites. This comes from the use of the CRUSH algorithm: a read or a write on a site does not require any communication with nodes outside the site (the clustermap is obtained from the local monitor). Also, access times are similar for writing and reading. For example, it takes 2.78 seconds to write 100 objects of 1 MB on 11 sites and 2.43 seconds to read them.

With Cassandra, high access times are due to the latencies inside the sites (Lfog). It takes 0.94 seconds to write 100 objects of 1 MB on one site without extra latencies (not presented in Table II). However, with Lfog = 10 ms, it takes 5.60 seconds (row 2, column 4). The forward mechanism due to the balancing of requests between nodes (see Section III-B) also increases the access times. More generally, access times increase linearly with the number of sites. The gossip traffic between the nodes (described in the next section) has an influence on the access times: adding more sites generates more gossip traffic and longer access times. As explained in the next part, the gossip traffic is asynchronous, but the thread managing it and the thread executing the clients' requests need synchronizations. Therefore, the number of nodes impacts the access times, but this seems to be only the consequence of an implementation issue.

We also observe that performance drops with big objects. On 11 sites, writing 10x10 MB with Cassandra is 1.7 times longer than writing 10x1 MB (8.25 s vs 4.94 s; row 8, columns 5 and 3), whereas with Rados and IPFS it is only approximately 1.2 times longer (3.68 s vs 2.38 s for Rados and 2.67 s vs 1.92 s for IPFS).

With IPFS, access times do not increase much when sites are added. For instance, it takes 2.30 seconds to write one object of 10 MB on 1 site (row 3, column 1) and 2.44 seconds with 11 sites (row 9, column 1). Reading 100 objects of 1 MB takes strictly the same time for 1, 7 and 11 sites (2.02 seconds) (rows 3, 6 and 9, column 9).

TABLE II
MEAN TIMES IN SECONDS TO WRITE AND READ ALL THE OBJECTS ON A SITE, WITH 1, 7 AND 11 SITES.
(Columns 1-5: mean writing time on a site; columns 6-10: mean reading time on a site. Access patterns, in order: 1x10 MB; 10x1 MB sequential; 10x1 MB parallel; 100x1 MB parallel; 10x10 MB parallel.)

Row 1  (1 site)   Rados      3.15  4.48  2.28  2.42  3.51 | 3.17  4.26  2.17  2.40  2.97
Row 2  (1 site)   Cassandra  3.87  9.56  3.56  5.60  7.42 | 3.91  9.09  3.54  5.36  5.76
Row 3  (1 site)   IPFS       2.30  2.75  1.95  2.77  2.80 | 1.58  1.94  1.24  2.02  2.06
Row 4  (7 sites)  Rados      3.27  5.02  2.28  2.46  3.83 | 3.23  4.92  2.21  2.41  3.84
Row 5  (7 sites)  Cassandra  4.97  9.73  4.24  6.65  7.93 | 4.91  9.64  4.22  6.08  7.05
Row 6  (7 sites)  IPFS       2.33  2.55  1.93  2.74  2.87 | 1.45  1.94  1.11  2.02  2.04
Row 7  (11 sites) Rados      3.67  5.70  2.38  2.78  3.68 | 3.31  5.01  2.26  2.43  3.66
Row 8  (11 sites) Cassandra  6.25 10.45  4.94  7.71  8.25 | 6.12 10.64  4.90  6.69  7.21
Row 9  (11 sites) IPFS       2.44  2.61  1.92  2.80  2.67 | 1.47  1.92  1.34  2.02  2.11

Fig. 6. Cumulated amount of network traffic (in MB) exchanged between all the sites while clients write objects on their sites: (a) 7 sites; (b) 11 sites.

Fig. 7. Cumulated amount of network traffic (in MB) exchanged between all the sites while clients read objects located on their sites: (a) 7 sites; (b) 11 sites.

2) Amount of network traffic sent between sites: Figures 6 and 7 show the amount of network traffic sent between sites. The first comment is that, for each system, the traffic that is exchanged between sites is relatively small in comparison to the amount of data stored (less than 2% of the amount of stored data).

For Rados, this traffic is the largest (7.5 MB both in writing and in reading for 100 x 1 MB on 7 sites) and it increases with the number of monitors (i.e., the number of sites), with the same order of magnitude for writing and reading. As previously described, all monitors send usage statistics to the elected monitor, which merges and redistributes them. This strategy becomes very costly when there is a huge number of monitors. The amount of metadata sent between the sites increases linearly with the number of objects, no matter their size. Indeed, this is because usage statistic packets have a size proportional to the number of accessed objects. This also explains why we observe more data exchanged between the sites when 10 objects are written in a sequential way than in a parallel way: when objects are sent sequentially, more statistics are sent because the test lasts longer, and the difference of exchanged data comes from the headers of the statistics packets, which are more numerous in this case. These metadata are sent asynchronously, so this traffic does not impact the access times. As discussed in Section III-A, it is possible to reduce the number of monitors (in our testbed we used one monitor per site). While such a strategy will decrease the network overheads related to the monitors, it will increase the traffic related to the OSD status reports and clustermap retrievals. We know Paxos is not a scalable algorithm, but these exchanges of usage statistics confirm that the number of monitors should be kept low.

With Cassandra, the amount of traffic increases linearly with the number of nodes because, each second, each host sends a gossip packet to another host, as explained in Section III-B.

Finally, the amount of metadata sent between sites with IPFS is linear in the number of manipulated objects on all the sites, no matter the size of the objects. While reading, IPFS does not send as much traffic between sites because there is no lookup in the DHT for every requested object: only requests sent to a node which does not store the object require accessing the DHT. If we add more storage nodes on the sites, clients will have a lower probability of requesting a node storing the data, and the DHT will be used more often, generating more network traffic between sites and higher access times.

3) Summary: For this experiment, the three systems have their advantages and their drawbacks in a Fog context. Thanks to the CRUSH algorithm, Rados has the same performance in reading and in writing, but the amount of network traffic sent between the monitors and the use of the Paxos algorithm can become a scaling difficulty. Cassandra is adapted to store small objects (performance drops with big objects). But, because its access times depend a lot on the value of Lfog, Cassandra will behave better in environments where the value of Lfog is low and does not vary. For IPFS, the main drawback is the global DHT, which generates more network traffic between the sites and higher access times when the number of accessed objects increases.

C. Remote reading evaluation (experiment 2)

The second experiment aims to evaluate the mobility criterion. Concretely, we want to analyze what the impacts on the access times are when a client writes data on its local site and reads it from another one. We use the same topology as in the previous experiments, with 7 sites. Rados, Cassandra and IPFS are deployed in the same way, but only two sites among the seven are used. The other sites provide nodes to the IPFS DHT and to the Rados monitors.

One client creates objects on its site. Then, caches are dropped and another client located on another site reads the objects. The read is performed twice in order to analyze the benefits of the implicit creation of a copy on the local site. For a given object, the requests for both read operations are sent to the same storage node. We remind that, for Rados and Cassandra, objects are not explicitly relocated on the site where the user performs the read (i.e., data placement constraints are not modified in Rados and in Cassandra). We just evaluate the time to perform a read remotely.

Table III shows the access times we get in this scenario. Standard deviations are all less than 0.8 seconds and thus still not reported. We observe that the writing time is the same as in the previous experiment when data is written on only one site (see rows 1 to 3, columns 1 to 5 in Table II and Table III). We compare the writing time with the version using one site because here, operations are performed on only one site.

TABLE III
MEAN TIMES IN SECONDS TO WRITE ALL OBJECTS ON A SITE AND READ THEM TWICE FROM A REMOTE SITE. THERE ARE 7 SITES, BUT ONLY ONE SITE IS USED TO WRITE AND ANOTHER ONE TO READ.
(Columns 1-5: mean writing time on the local site; columns 6-10: mean reading time on the remote site, first read; columns 11-15: second read. Access patterns, in order: 1x10 MB; 10x1 MB sequential; 10x1 MB parallel; 100x1 MB parallel; 10x10 MB parallel.)

Row 1  Rados      3.19  4.49  2.24  2.53  3.53 | 15.36 26.12  6.06  5.11 10.35 | 14.58 27.48  6.97  5.18 10.19
Row 2  Cassandra  3.37 10.01  3.49  6.27  6.76 | 13.90 30.92  9.12 20.97 26.88 | 14.06 30.75  9.11 18.66 26.28
Row 3  IPFS       2.77  3.50  1.92  2.75  2.37 |  4.40  9.27  2.10  6.27 12.48 |  1.16  2.60  2.02  2.33  2.33

Regarding the read accesses, for the first ones (columns 6 to 10), the Rados client directly contacts the OSD storing the requested object. So the increase of the access time in reading is only due to the network latency Lcore between the client and the storage node. It takes 15.36 seconds (row 1, column 6) to read 1 object of 10 MB remotely. This is roughly five times the time we measured in the previous experiment (3.17 seconds, row 1, column 6 in Table II). With IPFS and Cassandra, requests are sent to a local storage node that locates the object and retrieves it before forwarding it to the client (as shown in Figures 3 and 4). This mechanism increases the reading time. Moreover, in the previous experiment only half of the requests (the ones that requested objects from the node which does not store them) were forwarded. In this new experiment, because objects are stored remotely, the forwarding is performed for all requests.

For Cassandra, a remote read takes 13.90 seconds (row 2, column 6) whereas it lasted only 3.91 seconds in the previous experiment (row 2, column 6 in Table II). The metadata management does not imply an increase of the access time because, with Cassandra, once the gossip has propagated, every node can locate any object without any communication. For IPFS, a remote read of one object of 10 MB takes 4.40 seconds (row 3, column 6) whereas a local read took 1.58 seconds (row 3, column 6 in Table II). The access time is also increased by the DHT request, which is sometimes not required for a local read.

The last part of Table III (columns 11 to 15) shows that, for the second read, the IPFS access times are in the same order of magnitude as in the previous experiment. Namely, IPFS spends 1.16 seconds to read 1 object of 10 MB (vs 1.58 s in the previous experiment). Access times are low because the requested nodes serve a copy of the objects they kept during the first read. Indeed, this second read is a local read.

For Rados and Cassandra, the access time does not evolve as expected. Indeed, the mobility of data is not explicit and thus the second access also generates a remote read. It might be possible to get a similar behaviour by explicitly changing the placement rule of the pool (or of the keyspace).

Once again, IPFS delivers the best performance.

V. CONCLUSION AND FUTURE WORK

Fog/Edge Computing infrastructures are coming, and academics as well as industrials should revise current cloud services to meet the specifics of such massively distributed infrastructures. In this work, we presented a list of characteristics for Fog storage systems and evaluated three "off-the-shelf" object store solutions (namely Rados, Cassandra and IPFS) using the Yahoo Cloud System Benchmark (YCSB). The performances are measured in terms of latency and network traffic. Globally, Rados (because of the Paxos traffic) and Cassandra (because the access time is proportional to the number of nodes) encounter some difficulties to scale in the Fog context. Among the three tested solutions, IPFS seems to provide the lowest latencies, both for local and remote accesses. The monitored inter-site traffic is moreover relatively contained. However, the Kademlia DHT used in IPFS to ensure the metadata management is not well suited for the local activity expected in a Fog context. A locality-aware management of the metadata within IPFS will be proposed in a future work to address this problem.

REFERENCES

[1] B. Zhang, N. Mor, J. Kolb, D. S. Chan, N. Goyal, K. Lutz, E. Allman, J. Wawrzynek, E. Lee, and J. Kubiatowicz, "The Cloud is Not Enough: Saving IoT from the Cloud," in Proceedings of the 7th USENIX Conference on Hot Topics in Cloud Computing, ser. HotCloud'15, 2015, pp. 21–21.
[2] F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, "Fog Computing and Its Role in the Internet of Things," in Proceedings of the First Edition of the MCC Workshop on Mobile Cloud Computing, ser. MCC '12, 2012, pp. 13–16.
[3] H. Dubey, J. Yang, N. Constant, A. M. Amiri, Q. Yang, and K. Makodiya, "Fog Data: Enhancing Telehealth Big Data Through Fog Computing," in Proceedings of the ASE BigData & SocialInformatics 2015, ser. ASE BD&SI '15, 2015, pp. 14:1–14:6.
[4] M. Aazam and E.-N. Huh, "Fog Computing and Smart Gateway Based Communication for Cloud of Things," in Proceedings of the 2014 International Conference on Future Internet of Things and Cloud, ser. FICLOUD'14, 2014, pp. 464–470.
[5] S. A. Weil, A. W. Leung, S. A. Brandt, and C. Maltzahn, "RADOS: A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters," in Proceedings of the 2nd International Workshop on Petascale Data Storage: Held in Conjunction with Supercomputing '07, ser. PDSW '07, 2007, pp. 35–44.
[6] A. Lakshman and P. Malik, "Cassandra: A Decentralized Structured Storage System," SIGOPS Operating Systems Review, vol. 44, no. 2, pp. 35–40, Apr. 2010.
[7] J. Benet, "IPFS - Content Addressed, Versioned, P2P File System," Tech. Rep., 2014.
[8] D. Balouek, A. Carpen-Amarie, G. Charrier, F. Desprez, E. Jeannot, E. Jeanvoine, A. Lèbre, D. Margery, N. Niclausse, L. Nussbaum, O. Richard, C. Pérez, F. Quesnel, C. Rohr, and L. Sarzyniec, "Adding Virtualization Capabilities to the Grid'5000 Testbed," in Cloud Computing and Services Science, ser. Communications in Computer and Information Science, I. Ivanov, M. Sinderen, F. Leymann, and T. Shan, Eds. Springer International Publishing, 2013, vol. 367, pp. 3–20.
[9] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, "Benchmarking Cloud Serving Systems with YCSB," in Proceedings of the 1st ACM Symposium on Cloud Computing, ser. SoCC '10, 2010, pp. 143–154.
[10] V. Abramova and J. Bernardino, "NoSQL databases: MongoDB vs Cassandra," in Proceedings of the International C* Conference on Computer Science and Software Engineering, ser. C3S2E '13. New York, NY, USA: ACM, 2013, pp. 14–22.
[11] V. Abramova, J. Bernardino, and P. Furtado, Evaluating Cassandra Scalability with YCSB. Cham: Springer International Publishing, 2014, pp. 199–207.
[12] "Fog computing: A platform for Internet of Things and analytics," in Big Data and Internet of Things: A Roadmap for Smart Environments, ser. Studies in Computational Intelligence, N. Bessis and C. Dobre, Eds., 2014, vol. 546.
[13] C. C. Byers and P. Wetterwald, "Fog Computing Distributing Data and Intelligence for Resiliency and Scale Necessary for IoT: The Internet of Things," Ubiquity, vol. 2015, pp. 4:1–4:12, Nov. 2015.
[14] R. D. Souza Couto, S. Secci, M. E. Mitre Campista, and L. H. Maciel Kosmalski Costa, "Network design requirements for disaster resilience in IaaS clouds," IEEE Communications Magazine, vol. 52, no. 10, pp. 52–58, October 2014.
[15] M. Firdhous, O. Ghazali, and S. Hassan, "Fog Computing: Will it be the Future of Cloud Computing?" in Third International Conference on Informatics & Applications, Kuala Terengganu, Malaysia, 2014, pp. 8–15.
[16] P. H. Carns, W. B. Ligon III, R. B. Ross, and R. Thakur, "PVFS: A Parallel File System for Linux Clusters," in Proceedings of the 4th Annual Linux Showcase & Conference - Volume 4, ser. ALS'00, 2000, p. 28.
[17] S. Donovan, G. Huizenga, A. Hutton, C. Ross, M. Petersen, and P. Schwan, "Lustre: Building a file system for 1000-node clusters," in Proceedings of the Linux Symposium, 2003.
[18] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, "The Hadoop Distributed File System," in Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), ser. MSST '10, 2010, pp. 1–10.
[19] F. Hupfeld, T. Cortes, B. Kolbeck, J. Stender, E. Focht, M. Hess, J. Malo, J. Marti, and E. Cesario, "The XtreemFS architecture – a case for object-based file systems in Grids," Concurrency and Computation: Practice and Experience, vol. 20, no. 17, pp. 2049–2060, 2008.
[20] O. Tatebe, K. Hiraga, and N. Soda, "Gfarm grid file system," New Generation Computing, vol. 28, no. 3, pp. 257–275, 2010.
[21] S. A. Weil, S. A. Brandt, E. L. Miller, and C. Maltzahn, "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data," in Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, ser. SC '06, 2006.
[22] A. Legout, G. Urvoy-Keller, and P. Michiardi, "Understanding BitTorrent: An Experimental Perspective," Tech. Rep.
[23] P. Maymounkov and D. Mazières, "Kademlia: A peer-to-peer information system based on the XOR metric," in International Workshop on Peer-to-Peer Systems. Springer, 2002, pp. 53–65.