A Multi-Criteria Approach for Large-Object Cloud Storage
Total Page:16
File Type:pdf, Size:1020Kb
A Multi-criteria Approach for Large-object Cloud Storage Uwe Hohenstein1, Michael C. Jaeger1 and Spyridon V. Gogouvitis2 1Corporate Technology, Siemens AG Munich, Germany 2Mobility Management, Siemens AG Munich, Germany Keywords: Cloud Storage, Federation, Multi-criteria. Abstract: In the area of storage, various services and products are available from several providers. Each product pos- sesses particular advantages of its own. For example, some systems are offered as cloud services, while others can be installed on premises, some store redundantly to achieve high reliability while others are less reliable but cheaper. In order to benefit from the offerings at a broader scale, e.g., to use specific features in some cases while trying to reduce costs in others, a federation is beneficial to use several storage tools with their individual virtues in parallel in applications. The major task of a federation in this context is to handle the heterogeneity of involved systems. This work focuses on storing large objects, i.e., storage systems for videos, database archives, virtual machine images etc. A metadata-based approach is proposed that uses the metadata associated with objects and containers as a fundamental concept to set up and manage a federation and to con- trol storage locations. The overall goal is to relieve applications from the burden to find appropriate storage systems. Here a multi-criteria approach comes into play. We show how to extend the object storage developed by the VISION Cloud project to support federation of various storage systems in the discussed sense. 1 INTRODUCTION servers. They heavily rely on distributing data across several computers and prefer a schema-less storage of The National Institute of Standards and Technology data with a relaxed consistency concept. Certainly, (NIST) states that ”Cloud computing is a model for the settled technology of relational databases - if de- enabling ubiquitous, convenient, on-demand network ployed in a cloud - is also a kind of cloud storage. access to a shared pool of configurable computing There are corresponding offerings from major Cloud resources (e.g., networks, servers, storage, applica- providers. And finally, Blob stores represent a further tions, and services) that can be rapidly provisioned type of storage that should be mentioned. and released with minimal management effort or ser- Hence, we notice an increasing heterogeneity vice provider interaction” (Mell and Grance, 2011). of storage technologies even within the same cate- Hence, cloud computing represents a provisioning gory, with further differences from vendor to ven- paradigm for resources in first place. dor, whether deployed on-premises or in the cloud, Cloud storage is certainly one important cloud re- whether used as Platform-as-a-Service (PaaS) or as a source that benefits from the major characteristics of special Virtual Machine on the IaaS level. Each indi- cloud computing (Fox et al., 2009) such as vidual cloud storage has virtues of its own. virtually unlimited storage space, To benefit at a broader scale, a combination of • no upfront commitment for investments into hard- storage solutions seems to be useful. Some impor- • ware and software licenses, and tant aspects in this context are in the sense of polyglot pay per use for the occupied storage. persistence (Sandalage and Fowler, 2012): • to use the most appropriate storage technology for The term cloud storage is mostly associated with • the recent technology of Not only SQL databases each specific problem; (NoSQL, 2017), which attained a lot of popularity. to reduce costs by using a public cloud, by choos- Implied by an adaptation to distributed systems and • ing an appropriate cloud provider, particularly un- cloud computing environments, NoSQL databases der consideration of the various and complex price follow a different approach than the traditional fix- schemes and underlying factors such as price/GB, schema based model provided by relational database data transfer or number of transactions; 75 Hohenstein, U., Jaeger, M. and Gogouvitis, S. A Multi-criteria Approach for Large-object Cloud Storage. DOI: 10.5220/0006432100750086 In Proceedings of the 6th International Conference on Data Science, Technology and Applications (DATA 2017), pages 75-86 ISBN: 978-989-758-255-4 Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved DATA 2017 - 6th International Conference on Data Science, Technology and Applications to take into account access time and latency, e.g., 2 THE VISION CLOUD PROJECT • to use fast but expensive storage only when really needed but slow and cheap storage in other cases; VISION Cloud (Kolodner et al., 2011) was an EU to consider the differences between on-premises co-funded project for the development of metadata- • and cloud solutions with certain limitations, e.g., centric cloud storage solutions. The project developed the maximal database size, a reduced features set; a storage system and several domain applications to use a hybrid cloud for security and confiden- where the handling of rich metadata provides new in- • tiality issues, i.e., keeping confidential data in a novations. Domains targeted by VISION Cloud were private cloud while taking benefit from the public telecommunication, broadcasting and media, health cloud for non-confidential data; care, and IT application management. to shard data in general, e.g., according to geo- These domains share the need for an appropri- • graphic locations, costs etc. ate object storage system. The telecommunication, the broadcasting, and media domains envisage the These points are fully interleaved and demand storage of videos, the health care domain applica- for a multi-criteria approach. Therefore a model tion stores high resolutions diagnostics images, and is needed that captures the necessary information the IT application management stores virtual ma- and creates associations between all involved entities. chine images. They all share the need for grow- This information can be used to find an optimal data ing storage capacity and large storage consumption placement solution for an overall benefit. per object. The images of virtual machines in a In this paper, we base our work upon a metadata- data center grow bigger. The media domain is mov- based approach for a cloud storage scheme de- ing to Ultra High-Definition and 4K resolution con- veloped by the European funded VISION Cloud tent. And in the telecommunication domain, shar- project (Kolodner et al., 2011)(Kolodner (2) et al., ing of video messages turns into a trend as the mar- 2012). VISION Cloud aimed at developing next ket share of high-resolution camera equipped hand- generation technologies for storing large objects like sets grows (VISION-Cloud, 2011). All these domains videos and virtual machine images, accompanied by in VISION Cloud benefit from an object storage de- a content-centric access to storage. Following the veloped in the project and serve as a proof-of-concept: CDMI proposal (CDMI, 2010), the approach relies They require an increasing need of large capacity for upon objects and containers and offers first-class sup- the expectation to store a vast number of large objects port for metadata for these storage entities. and the ability to maintain rich metadata sets in order VISION Cloud has implemented a simple federa- to navigate and retrieve stored content. tion approach that provides some basic access to sev- Pursuing a metadata-centric approach, VISION eral storage solutions. We extend the original federa- Cloud stresses the ability to represent the type and for- tion approach to tackle the needs of a multi-criteria mat of the stored objects. With such an awareness of storage solution that attempts to combine different the storage, functionality can be triggered depending storage technologies. Indeed, having a federation, on the execution context and the currently processed it is possible to benefit from the advantages of vari- storage object. The automation based on the aware- ous storage solutions, private, on-premise and public ness also contributes to the ability to deal with a high clouds, access speed, best price offerings etc. while number of objects stored in an autonomous manner. the same way avoiding the disadvantages of a single storage. A federation approach can provide a unified 2.1 The VISION Concept of Metadata and location-independent access interface, i.e., trans- parency for data sources, while leaving the federation Classic approaches that handle large objects basically participants autonomous. organize files in a hierarchical structure in order to al- The remainder of this work is structured as fol- low navigating through the hierarchy and finally find- lows: Section 2 explains the VISION Cloud software ing a particular item. However, it can be quite dif- that is relevant and extended in this work: the concept, ficult to set up a hierarchy in an appropriate manner especially of using metadata, the storage interfaces, that provides flexible search options with acceptable and the storage architecture. The original cloud fed- access performance and intuitive categories for ever eration approach of VISION Cloud is also presented. increasing amounts of data. Moreover, such a hierar- We explain an extended federation approach, partic- chy has usually to be organized manually and is thus ularly the architectural setup, in Section 3, and con- prone to outdated or wrongly applied placements. In tinue with technical details in Section 4. Section 5 addition,