Object-Based Storage By: Kanishk Jain Dated: 10th May, 2007


CSE 598D - Storage Systems Survey

This literature survey is the only work so far in which the overall progress of object-based storage technology has been analyzed following its standardization. It also illustrates how the trends of recent research fit into the broad scope of the object-based storage environment, thus bringing out the current status of the technology while providing a reality check for the concept. A special feature of this survey is an attempted comparison of existing object-based file systems, mainly from a design perspective.

Abstract

Object-based storage is a new technology that provides intelligence at the storage device. The object storage device (OSD) interface has recently been standardized. The main characteristic of an OSD is “intelligent data layout”. Its advantages include scalability, security, reliability, performance and ease of management. This literature survey looks at the capabilities of object-based storage and explores how it improves data sharing, security, and device intelligence. It analyzes various aspects of object-based storage, such as file system design and application-based optimizations, in order to understand the advantages of this upcoming storage technology. Many of the ideas presented in this survey suggest extensions to the OSD interface to enhance performance, security, quality of service and so on. Hence the interface is still evolving.

1. Introduction

The evolution and stability of current storage interfaces (SCSI and ATA/IDE) have allowed continual advances in both storage devices and applications, without frequent changes to the standards. However, since the interface ultimately determines the functionality supported by the devices, current interfaces are holding system designers back.
Storage technology has progressed to the point that a change in the device interface is needed. Object-based storage [1] is an emerging technology designed to address this problem. The OSD (object storage device) interface has recently been standardized (as the ANSI T10 Object-based Storage Devices Standard). The main characteristic of an OSD is “intelligent data layout”.

A storage object is a logical collection of bytes on a storage device, with well-known methods for access, attributes describing characteristics of the data, and security policies that prevent unauthorized access. Unlike blocks, objects are of variable size and can be used to store entire data structures, such as files, database tables, medical images, or multimedia.

Objects can be regarded as the convergence of two technologies: files and blocks. Files provide user applications with a higher-level storage abstraction that enables secure data sharing across different operating system platforms, but often at the cost of limited performance due to file server contention. Blocks offer fast, scalable access to shared data; but without a file server to authorize the I/O and maintain the metadata, this direct access comes at the cost of limited security and data sharing.

Objects can provide the advantages of both files and blocks. Like blocks, objects are a primitive unit of storage that can be directly accessed on a storage device (i.e., without going through a server); this direct access offers performance advantages similar to those of blocks. Like files, objects are accessed through an interface that hides from applications the metadata needed to store the object, thus making the object easily accessible across different platforms. Providing direct, file-like access to storage devices is therefore the key contribution of object-based storage. Figure 1 shows the OSD model and the object interface it provides. A detailed description of the advantages of the OSD model is given in [9].
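The definition of a storage object given above (a variable-size collection of bytes, descriptive attributes, and an access check enforced by the device itself) can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the T10 command set: the names ToyOSD, StorageObject and make_capability are hypothetical, and the HMAC-based capability is an assumed stand-in for whatever security policy a real device enforces.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


def make_capability(secret: bytes, object_id: int, op: str) -> bytes:
    """Hypothetical capability: an HMAC over (object id, operation)."""
    message = f"{object_id}:{op}".encode()
    return hmac.new(secret, message, hashlib.sha256).digest()


@dataclass
class StorageObject:
    """A variable-size byte container plus attributes describing the data."""
    data: bytearray = field(default_factory=bytearray)
    attributes: dict = field(default_factory=dict)


class ToyOSD:
    """Hypothetical in-memory object storage device (illustration only)."""

    def __init__(self, secret: bytes):
        self._secret = secret   # assumed to be shared with the capability issuer
        self._objects = {}      # object id -> StorageObject
        self._next_id = 1

    def _check(self, object_id: int, op: str, capability: bytes) -> None:
        # The device itself authorizes each request; no file server is involved.
        expected = make_capability(self._secret, object_id, op)
        if not hmac.compare_digest(expected, capability):
            raise PermissionError(f"{op} on object {object_id} not authorized")

    def create(self) -> int:
        # Creation is left unauthenticated to keep the sketch short.
        object_id = self._next_id
        self._next_id += 1
        self._objects[object_id] = StorageObject()
        return object_id

    def write(self, object_id: int, offset: int, payload: bytes,
              capability: bytes) -> None:
        self._check(object_id, "write", capability)
        obj = self._objects[object_id]
        end = offset + len(payload)
        if len(obj.data) < end:
            # Objects grow on demand; any gap is zero-filled.
            obj.data.extend(bytes(end - len(obj.data)))
        obj.data[offset:end] = payload
        obj.attributes["length"] = len(obj.data)

    def read(self, object_id: int, offset: int, length: int,
             capability: bytes) -> bytes:
        self._check(object_id, "read", capability)
        return bytes(self._objects[object_id].data[offset:offset + length])

    def get_attribute(self, object_id: int, name: str, capability: bytes):
        self._check(object_id, "read", capability)
        return self._objects[object_id].attributes.get(name)
```

In this sketch a client addresses data by object identifier and byte offset rather than by block number, and a request carrying the wrong capability is rejected by the device, which is the property that lets clients bypass a file server without being implicitly trusted.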
An illustration of the hierarchy of OSD objects and attributes can be found in [2].

Figure 1: Object Storage Device model and interface

This literature survey looks at the capabilities of object-based storage and explores how it improves data sharing, security, and device intelligence. The rest of the paper is organized as follows. Section 2 outlines the advantages of object-based storage in a cluster computing environment. Section 3 analyzes object-based storage from a file systems perspective. Section 4 discusses the use of application-specific attributes and illustrates their use in database storage management. Section 5 indicates trends of recent research in object-based storage. Section 6 reviews related work. Finally, Section 7 concludes the paper.

2. Object-Based Storage for Cluster Computing

Instead of using proprietary, expensive supercomputers to solve the most challenging computing problems, nearly every new supercomputing system installed today is composed of thousands of low-cost Linux servers united into a cluster. Supercomputing applications, apart from having high computational complexity, need high-performance data access. Without rapid and efficient access to data, scarce computing resources sit idle. Traditional networked storage systems are simply incapable of providing the data throughput needed to keep ever-growing Linux clusters operating efficiently. Equally important, these massive datasets need to be made globally available to all processes executing across the compute cluster, to simplify application development and to ease the burden of managing data repositories. Here again, traditional networked storage systems fall short: they cannot scale capacity within a single namespace, and thereby increase the time and complexity of managing networked data.
To understand the need for a new approach to scalable storage, it is essential to explore the manner in which many cluster computing applications address the storage bottleneck. Linux cluster applications use a scale-out approach to parallel computing. In this model, applications employ a 'divide-and-conquer' approach, decomposing the problem to be solved into thousands of independently executed tasks. The most common decomposition approach exploits a problem's inherent data parallelism: the problem is broken into pieces by identifying the data partition that each individual task operates on, and each task is then distributed, together with its partition, to a compute node for processing.

The natural inclination of cluster computing developers is to deploy a networked storage solution that can be accessed by all nodes in the cluster. Such a solution greatly simplifies management of the compute jobs, as all data partitions and replicas can be made available to all nodes, and hence any of the tasks can be computed on any node. Additionally, the output of these jobs can then be used directly elsewhere: in post-processing, visualization or even as the input to the next processing task in a computational pipeline. However, neither storage area network (SAN) nor network attached storage (NAS) architectures support the aggressive concurrency and per-client throughput requirements of scalable cluster computing applications [1,4].

Figure 2 illustrates NAS being used to share files among a number of clients. The files themselves may be stored on a fast SAN. However, because the clients often suffer from queuing delays at the server, they rarely see the full performance of the SAN. The file server mediates all requests and thus becomes the bottleneck.

Figure 2: The NAS architecture

Figure 3: The SAN architecture

Figure 3 shows a SAN file system being used to share files among a number of clients.
The files themselves are stored on a fast storage area network (e.g., iSCSI) to which the clients are also attached. File server queuing delays are avoided by having the file server share metadata with the clients, which can then access the storage devices directly. However, since the devices cannot authorize I/O, the file server must assume that the clients are trusted. Hence, while the file server is removed as a bottleneck, security is a concern.

Because of these limitations, organizations are forced to adopt a process in which data from a shared storage system is staged (copied) to the compute nodes, processing is performed, and results are de-staged from the nodes back to shared storage when done. In many applications, the staging setup time can be appreciable, up to several hours for large clusters.

Object-based storage clustering [4] is useful in unlocking the full potential of these Linux compute clusters, as object storage clusters have the intrinsic ability to scale linearly in capacity and performance to meet the demands of supercomputing applications (the scalability of the object-storage architecture is explained in detail in [9]). Object-based storage offers high-bandwidth parallel data access between thousands of Linux cluster nodes and a unified storage cluster over standard TCP/IP networks. It is a solution in which the storage system's scalability can be precisely matched, and then scaled, to the needs of the compute cluster. Together, Linux clusters and object-based storage clusters deliver commodity-like supercomputers able to keep pace with increasingly voracious applications.

In an object-based storage architecture, the dynamic, self-managing data objects are stored across a cluster of intelligent object storage devices (OSDs). Data objects are fundamental containers that house both application data (including metadata describing the "mapping" of object data to physical disk drives) and an extensible set of storage attributes (application specific