SLA-Aware Adaptive Mapping Scheme in Bigdata Distributed Storage Systems

SMA 2020, September 17-19, Jeju, Republic of Korea

Sopanhapich Chum
Department of Computer Science, Dankook University, Yongin, Korea

Jerry Li
Memory Solutions Lab., Samsung Semiconductor Inc., San Jose, CA, USA

Heekwon Park
Memory Solutions Lab., Samsung Semiconductor Inc., San Jose, CA, USA

Jongmoo Choi
Department of Computer Science, Dankook University, Yongin, Korea

ABSTRACT

As data are processed by diverse clients ranging from urgent, time-critical ones to best-effort ones, supporting different levels of QoS (Quality of Service) becomes a vital requirement for a distributed storage system. In this paper, we propose a novel SLA (Service Level Agreement)-aware adaptive mapping scheme that differentiates between urgent and normal clients based on their I/O requirements. The scheme divides storage into two regions, normal and urgent, which makes it feasible to isolate urgent clients from normal ones. In addition, it changes the size of the isolated region adaptively so that it can provide better performance to normal clients. We also devise two techniques, called logical cluster and normal inclusion, to prevent the interference caused by this adaptability. We implement our scheme using the CRUSH (Controlled Replication Under Scalable Hashing) mapping algorithm in Ceph, a well-known distributed storage system. Evaluation results demonstrate that our proposal can comply with SLA requirements while providing better performance than the fixed mapping scheme.

CCS CONCEPTS

• Information systems → Information storage systems; Distributed storage; Hierarchical storage management; • Software and its engineering → Software system structures.

KEYWORDS

Bigdata, Cloud storage, Service Level Agreement, Performance

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
SMA 2020, September 17-19, Jeju, Republic of Korea, © 2020 Copyright held by the owner/author(s).

1 INTRODUCTION

These days, data are generated from various sources including on-line transactions, SNSs (Social Network Services), autonomous vehicles and smart infrastructures [7]. According to an IDC article, world data will grow from 33 ZB (ZettaByte) in 2018 to 175 ZB by 2025, a compound annual growth rate of 61% [23]. In addition, these data are processed instantly to extract information for decision making, recommendation and autonomous control using various analytic models and frameworks [1, 8, 11, 20, 27, 30]. We indeed live in the Bigdata era [24].

Distributed storage (also called Cloud storage) plays a key role in the Bigdata era. There are various distributed storage systems, including GFS [14], HDFS [26], Ceph [2, 33], Azure Storage [16], Amazon S3 [22], Openstack Swift [18], Haystack [6], Lustre [35] and GlusterFS [17]. The typical architecture of a distributed storage system consists of three components: client nodes, storage nodes and master nodes. A data object generated by a client is mapped onto storage nodes, while master nodes take care of placement, replication and recovery.

As distributed storage systems are used by diverse clients, it becomes indispensable to provide QoS (Quality of Service) in accordance with the different requirements of those clients. For instance, a traffic monitoring client needs to analyze road congestion data by a pre-specified time. A stock trading client that needs to inspect previous transaction data before the next trading time is another example of a time-critical service. In contrast, batch processing and data backup are non-time-critical services that can be conducted in a best-effort way, without interfering with urgent clients. Such different requirements are contracted between clients and storage providers as an SLA (Service Level Agreement) in terms of performance, availability and cost [3, 28, 29, 36].

In this paper, we propose a novel SLA-aware adaptive mapping scheme for distributed storage systems. It makes use of the mapping between clients and storage nodes. It classifies clients into two types: normal and urgent. It also divides storage nodes into two regions, one for normal clients and the other for urgent ones. The number of storage nodes in the urgent region is determined by the SLA. A data object requested by a normal client is then mapped into the normal region, while one requested by an urgent client is mapped into the urgent region. This isolation protects urgent clients from interference by normal clients.

However, this fixed isolation suffers from performance degradation due to low storage bandwidth utilization, especially when the number of active urgent clients becomes small. To overcome this problem, we design our scheme to work adaptively so that it can adjust the urgent region size according to the activity of urgent clients. When the activity of urgent clients becomes low, it shrinks the urgent region, which in turn boosts the performance of normal clients.

This adaptability, however, raises two issues. The first issue is about mapping. In our adaptive scheme, a data object generated by a normal client can be mapped into either the normal region or the urgent region. To handle this, we devise a technique, called logical cluster, that abstracts storage nodes differently based on different cluster maps, which is further discussed in Section 3.

The second issue is about the SLA guarantee. Once a data object of a normal client is written into the urgent region, the client reads that data from the urgent region even after the activity of urgent clients becomes high, which may cause an SLA violation. To mitigate this problem, we suggest a technique, called normal inclusion, which ensures that at least one replica of a data object is stored in the normal region. Together, logical cluster and normal inclusion can prevent normal clients from interrupting urgent clients without hurting normal client performance.

We implement our proposal by modifying the CRUSH (Controlled Replication Under Scalable Hashing) algorithm in Ceph [34]. Ceph is a widely used open-source distributed storage system, deployed on various cloud platforms including Hadoop, AWS and Openstack [2]. Evaluation results show that our scheme can improve performance by up to 18% while guaranteeing the SLA required by urgent clients.

The rest of this paper is organized as follows. Section 2 introduces background and related work. In Section 3, we explain our design. Experiments and evaluations are discussed in Section 4. The conclusion and future work are summarized in Section 5.

2 BACKGROUND

In this section, we first explain the architecture of Ceph and its mapping scheme. Then, we survey related work.

2.1 Ceph: Distributed Storage Cluster

Figure 1 presents the logical structure of Ceph [2, 33]. It is a unified, distributed, object-based storage cluster designed for excellent performance, reliability and scalability. It consists of clients and storage nodes, where a storage node is called an OSD (Object Storage Device) in Ceph. Each OSD takes charge of data manipulation such as storing, retrieval and storage management. In the initial version of Ceph, an OSD is governed by a Ceph OSD daemon and a kernel-level local file system such as Ext4 or XFS [33]. In recent versions, however, it is controlled by a key-value store and a user-level file system, called BlueStore and BlueFS, respectively [2].

Figure 1: Logical structure of Ceph.

There are two core components in Ceph. One is the RADOS (Reliable Autonomic Distributed Object Store) framework [5]. It supports several features including self-healing, self-managing, strong consistency and dynamic cluster expansion.

The second core component is the CRUSH (Controlled Replication Under Scalable Hashing) algorithm [34]. CRUSH is responsible for data placement, that is, the mapping between a data object and an OSD. Figure 2 illustrates how CRUSH works. When a file is created by a client, it is split into multiple data objects according to the object size configuration. Then, each object is assigned to a pool using a simple hash function with the object ID as the hash key. A pool can be considered a logical partition. Within a pool, objects are sharded among units called PGs (Placement Groups).

Figure 2: Mapping scheme in Ceph.

Each PG is mapped onto OSDs using CRUSH. CRUSH is a pseudo-random data distribution function that maps each PG onto multiple OSDs based on Ceph's placement rules. The rules define the replication factor and any constraints on placement. By default, the replication factor is 3, meaning that each PG is mapped onto 3 OSDs, one as the primary and the other two as replicas. During mapping, CRUSH makes use of a map, called the cluster map, that is a compact and hierarchical description of the OSDs, machines and racks in a storage cluster. By changing this cluster map, we can add new OSDs or remove failed OSDs.

Note that other distributed storage systems such as GFS [14] and HDFS [26] employ a master node to manage the mapping between objects and storage nodes. Using a hash-based mapping scheme instead of a dedicated master node gives several benefits. First, this algorithmic approach can improve performance because clients access OSDs directly, without consulting a master node to look up mapping information. Second, it avoids a single point of failure, enhancing reliability. Finally, the pseudo-random property of CRUSH can distribute objects uniformly across OSDs.
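The two-step mapping described in Section 2.1 (object to PG by hashing, then PG to a replica set of OSDs by a pseudo-random function) can be illustrated with a minimal sketch. It is not Ceph's implementation: SHA-256 and a highest-random-weight ranking stand in for Ceph's hash function and for CRUSH, the cluster map is flattened to a plain list of OSD names, and the names stable_hash, object_to_pg, pg_to_osds and locate are hypothetical helpers introduced only for illustration.

```python
import hashlib
from typing import List, Tuple


def stable_hash(*parts: object) -> int:
    """Deterministic 64-bit hash of the parts (assumed stand-in for Ceph's hash)."""
    data = "/".join(str(p) for p in parts).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def object_to_pg(object_id: str, pg_num: int) -> int:
    """Step 1: shard an object into one of pg_num placement groups."""
    return stable_hash(object_id) % pg_num


def pg_to_osds(pool: str, pg: int, osds: List[str], replicas: int = 3) -> List[str]:
    """Step 2: pick `replicas` distinct OSDs for a PG.

    Uses highest-random-weight ranking as a simplified stand-in for CRUSH:
    every client computes the same ranking locally, so no master lookup is needed.
    The first OSD returned plays the role of the primary.
    """
    if replicas > len(osds):
        raise ValueError("not enough OSDs for the requested replication factor")
    ranked = sorted(osds, key=lambda osd: stable_hash(pool, pg, osd), reverse=True)
    return ranked[:replicas]


def locate(object_id: str, pool: str, pg_num: int, osds: List[str],
           replicas: int = 3) -> Tuple[int, List[str]]:
    """Full path: object ID -> PG -> replica set of OSDs."""
    pg = object_to_pg(object_id, pg_num)
    return pg, pg_to_osds(pool, pg, osds, replicas)


if __name__ == "__main__":
    cluster = [f"osd.{i}" for i in range(6)]  # flat, hypothetical cluster map
    pg, placed = locate("file1.obj0", "pool-a", 128, cluster)
    print(f"PG {pg} -> primary {placed[0]}, replicas {placed[1:]}")
```

Because the placement is a pure function of the object ID, the pool and the list of OSDs, any client holding the same list computes the same result. In this sketch, removing an OSD that does not hold a given PG leaves that PG's placement unchanged, which loosely mirrors how editing the cluster map adds or removes OSDs.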
