Implementation Guide
Service Provider Data Center For Disclosure under NDA Only
Scalable Cloud-based Services for Flexible, Robust, Cost-Effective Storage
Discover how to improve storage utilization, reduce cost and improve scalability with a Ceph*-based storage-as-a-service (STaaS) solution optimized for Intel® technology
Introduction
Storage-as-a-service (STaaS) uses software-defined storage (SDS) to abstract storage software from the storage hardware. By providing a shared pool of storage capacity that can be used across service offerings, SDS eliminates storage silos and helps improve utilization ratios. Intelligent, automated orchestration reduces operating costs and can speed provisioning from several weeks to a few minutes.

Ceph* is an open source STaaS solution that supports object, block and file storage. Using an open source solution can help you lower costs and avoid vendor lock-in. It also enables you to deploy new technology quickly. Ceph is well supported by the open source community, and commercial distributions are also available. Intel, the community and Independent Software Vendors (ISVs) have worked closely to develop reference architectures and best practices for deploying a Ceph-based STaaS platform that is optimized to run on Intel® Xeon® processors and take advantage of other Intel® technologies such as Intel® SSD Data Center Family for NVMe* (Non-Volatile Memory Express*) Solid State Drives (SSDs), Intel® Ethernet Products and software optimizations such as Intel® Intelligent Storage Acceleration Library (Intel® ISA-L) and Intel® Cache Acceleration Software (Intel® CAS).

This implementation guide provides key learnings and configuration insights to integrate technologies with optimal business value.
If you are responsible for…
• Technology decisions: You will learn how to implement a storage-as-a-service (STaaS) solution using Ceph*. You’ll also find tips for optimizing performance with Intel® technologies and best practices for deploying Ceph.
Table of Contents
Introduction
Solution Overview
  Ceph Software
  Node Types
  Intel® Technologies
System Requirements
  Software Requirements
  Minimum Hardware Requirements
Installation and Configuration
  Get Ceph
  Install Ceph
  Deploy Storage Clusters
  Deploy Ceph Clients
  Configuration Considerations
Ceph Operation and Utilization
  Ceph Topologies
  Using Intel CAS
  Using Intel ISA-L
  Integrating with OpenStack
  Orchestration
Ceph-Based STaaS Use Cases
Validation
Best Practices
  Planning
  Nodes
  Journaling
  Network
Summary
References
Appendix A: Ceph Tuning Details
Solutions Proven by Your Peers

Solution Overview
A Ceph-based STaaS deployment consists of the Ceph software, several types of nodes (servers), and Intel software optimization products.

Ceph Software
Ceph’s foundation is the Reliable Autonomic Distributed Object Store* (RADOS*), which provides your applications with object, block and file system storage in a single unified storage cluster. This makes Ceph flexible, highly reliable and easy for you to manage. Each of your applications can use the object, block or file system interfaces to the same RADOS cluster simultaneously, which means your Ceph storage system serves as a flexible foundation for all of your data storage needs. You can use Ceph for free because it is open source, and deploy it on economical industry-standard hardware. Or you can opt for a commercially supported Ceph distribution if you prefer.

The various storage access modes use different components of Ceph:
• Object storage uses the Ceph Object Gateway daemon, radosgw* (RGW*).
• File storage (CephFS*) can use a Ceph filesystem kernel driver or the user space FUSE* client.
• Block storage uses RADOS Block Devices (RBDs*).
• All data is ultimately stored by Ceph Object Storage Daemons (Ceph OSDs).
• All data is stored as “objects” which are pseudo-randomly distributed across the cluster by the CRUSH* (Controlled Replication Under Scalable Hashing*) algorithm.

To efficiently compute information about object placement and location, Ceph uses the CRUSH algorithm instead of a central lookup table. CRUSH enables Ceph performance to scale linearly by ensuring that data is always retrieved directly from the primary OSD where it is stored, avoiding bottlenecks created by centralized metadata lookups.

Node Types
As you build your Ceph cluster using the guidelines in this document, you will be working with several types of “nodes” (sometimes referred to as “hosts”). A node is simply any single machine or server in a Ceph system.
• Storage nodes (sometimes called “OSD nodes” or simply “OSDs”) are where the actual data is stored.
• Monitor nodes track the health and configuration of the Ceph cluster by maintaining copies of the cluster maps.
• RGW nodes serve as HTTP proxies for object storage workloads.
• Metadata nodes map the directories and filenames from CephFS to objects stored within RADOS clusters.
• Client nodes request data.
See Figure 1 for an overview of how all these fit together into a Ceph cluster.
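The idea of computing object placement rather than looking it up in a central table, as described for CRUSH under Ceph Software, can be illustrated with a small sketch. The code below is not Ceph's CRUSH implementation; it is a simplified stand-in that uses rendezvous (highest-random-weight) hashing, and the function names, the OSD identifiers and the default of three replicas are all assumptions made for illustration. It shows only that any client holding the same list of OSDs can compute the same placement independently.

```python
import hashlib


def _score(object_name: str, osd_id: str) -> int:
    """Deterministic pseudo-random score for an (object, OSD) pair."""
    digest = hashlib.sha256(f"{object_name}:{osd_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place_object(object_name: str, osd_ids: list[str], replicas: int = 3) -> list[str]:
    """Pick the OSDs for an object; the first entry plays the role of the primary.

    Illustrative only: real CRUSH also accounts for device weights and
    failure domains, which this sketch ignores.
    """
    if replicas > len(osd_ids):
        raise ValueError("not enough OSDs for the requested replica count")
    # Rank every OSD by its score for this object and keep the top `replicas`.
    ranked = sorted(osd_ids, key=lambda osd: _score(object_name, osd), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    osds = [f"osd.{i}" for i in range(6)]  # hypothetical OSD identifiers
    print(place_object("my-object", osds))
```

Because the result depends only on the object name and the list of OSDs, no lookup service sits in the data path. In this sketch, removing an OSD that was not selected for an object leaves that object's placement unchanged.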
Intel® Technologies
Several Intel technologies, both hardware and software, contribute to the reliability and performance of a Ceph-based STaaS solution. See the “References” section for links.
• Intel® Xeon® processors and the Intel® Xeon® processor Scalable family provide the compute power needed to process vast amounts of data.
• Intel® SATA-based SSDs, Intel® NVMe*-based SSDs and Intel® Optane™ SSDs provide performance, stability, efficiency and low power consumption.