Kunpeng BoostKit for SDS

Deployment Guides

Issue 05
Date: 2021-03-23

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.


Contents

1 Ceph-Ansible Deployment Guide (CentOS 7.6)
1.1 Introduction
1.2 Environment Requirements
1.3 Configuring the Deployment Environment
1.4 Installing ceph-ansible
1.5 Configuring Block Storage
1.6 Configuring File Storage
1.7 Configuring Object Storage
1.8 Deploying the Ceph Cluster
1.9 Expanding the Cluster
1.9.1 Adding Ceph MON Nodes
1.9.2 Adding OSD Nodes
1.9.2.1 Adding Ceph OSDs with the Same Drive Topology as Existing OSDs
1.9.2.2 Adding Ceph OSDs with a Different Drive Topology from Existing OSDs
1.9.3 Adding MDS Nodes
1.10 Deleting the Cluster
1.11 More Resources
2 Ceph Block Storage Deployment Guide (CentOS 7.6)
2.1 Introduction
2.2 Environment Requirements
2.3 Configuring the Deployment Environment
2.4 Installing Ceph
2.4.1 Installing the Ceph Software
2.4.2 Deploying MON Nodes
2.4.3 Deploying MGR Nodes
2.4.4 Deploying OSD Nodes
2.5 Verifying Ceph
2.5.1 Creating a Storage Pool
2.5.2 Creating Block Devices
2.5.3 Mapping Block Device Images
3 Ceph Block Storage Deployment Guide (openEuler 20.03)
3.1 Introduction


3.2 Environment Requirements
3.3 Configuring the Deployment Environment
3.4 Installing Ceph
3.4.1 Installing the Ceph Software
3.4.2 Deploying MON Nodes
3.4.3 Deploying MGR Nodes
3.4.4 Deploying OSD Nodes
3.5 Verifying Ceph
3.5.1 Creating a Storage Pool
3.5.2 Creating Block Devices
3.5.3 Mapping Block Device Images
4 Ceph Object Storage Deployment Guide (CentOS 7.6)
4.1 Introduction
4.2 Environment Requirements
4.3 Configuring the Deployment Environment
4.4 Installing Ceph
4.4.1 Installing the Ceph Software
4.4.2 Deploying MON Nodes
4.4.3 Deploying MGR Nodes
4.4.4 Deploying OSD Nodes
4.5 Verifying Ceph
4.5.1 Deploying RGW Nodes
4.5.2 Creating a Storage Pool
4.5.3 Creating an RGW Account
4.5.4 Enabling RGW Data Compression
5 Ceph Object Storage Deployment Guide (openEuler 20.03)
5.1 Introduction
5.2 Environment Requirements
5.3 Configuring the Deployment Environment
5.4 Installing Ceph
5.4.1 Installing the Ceph Software
5.4.2 Deploying MON Nodes
5.4.3 Deploying MGR Nodes
5.4.4 Deploying OSD Nodes
5.5 Verifying Ceph
5.5.1 Deploying RGW Nodes
5.5.2 Creating a Storage Pool
5.5.3 Creating an RGW Account
5.5.4 Enabling RGW Data Compression
6 Ceph File Storage Deployment Guide (CentOS 7.6)
6.1 Introduction


6.2 Environment Requirements
6.3 Configuring the Deployment Environment
6.4 Installing Ceph
6.4.1 Installing the Ceph Software
6.4.2 Deploying MON Nodes
6.4.3 Deploying MGR Nodes
6.4.4 Deploying OSD Nodes
6.5 Verifying Ceph
6.5.1 Configuring MDS Nodes
6.5.2 Creating Storage Pools and a File System
6.5.3 Mounting the File System to the Clients
7 Ceph File Storage Deployment Guide (openEuler 20.03)
7.1 Introduction
7.2 Environment Requirements
7.3 Configuring the Deployment Environment
7.4 Installing Ceph
7.4.1 Installing the Ceph Software
7.4.2 Deploying MON Nodes
7.4.3 Deploying MGR Nodes
7.4.4 Deploying OSD Nodes
7.5 Verifying Ceph
7.5.1 Configuring MDS Nodes
7.5.2 Creating Storage Pools and a File System
7.5.3 Mounting the File System to the Clients
8 Ceph Automatic Deployment Guide (CentOS 7.6)
8.1 Introduction
8.2 Configuring the Deployment Environment
8.2.1 Configuring the Physical Environment
8.2.2 Configuring Software
8.2.3 Setting Installation Parameters
8.2.4 Verifying the Installation Parameters
8.3 Installing Ceph
A Change History


1 Ceph-Ansible Deployment Guide (CentOS 7.6)

1.1 Introduction
1.2 Environment Requirements
1.3 Configuring the Deployment Environment
1.4 Installing ceph-ansible
1.5 Configuring Block Storage
1.6 Configuring File Storage
1.7 Configuring Object Storage
1.8 Deploying the Ceph Cluster
1.9 Expanding the Cluster
1.10 Deleting the Cluster
1.11 More Resources

1.1 Introduction

ceph-ansible Overview

ceph-ansible contains Ansible playbooks used to deploy the Ceph distributed system.

Ansible is an automated O&M tool developed in Python. It integrates the advantages of various O&M tools (such as Puppet, CFEngine, Chef, Func, and Fabric) to provide functions such as system configuration, program deployment, and batch command execution.

Ansible itself is only a framework and does not provide batch deployment capabilities on its own; it is the modules that Ansible runs that perform the actual batch deployment work.
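As a minimal illustration of this module-based model (the inventory file name hosts and the group name mons are only examples, matching the inventory created later in this guide), an ad hoc call runs one module against every host in a group:

ansible all -i hosts -m ping
ansible mons -i hosts -m yum -a "name=ceph state=present"

Playbooks such as those shipped with ceph-ansible simply orchestrate many such module calls in a defined order.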


Recommended Version

stable-4.0

Deployment Process

Figure 1-1 shows the deployment process.

Figure 1-1 Deployment process

1.2 Environment Requirements

Hardware Requirements

Table 1-1 lists the hardware requirements.

Table 1-1 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor

Cores 2 x 32-core


CPU Frequency 2600 MHz

Memory Size 12 x 16 GB

Memory Frequency 2666 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drive
System drives: RAID 1 (2 x 960 GB SATA SSDs)
Data drives: JBOD enabled in RAID mode (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID Controller Card LSI SAS3508

Software Requirements

Table 1-2 lists the software requirements.

Table 1-2 Software requirements
Software    Version

OS CentOS release 7.6.1810 Installation mode: Infrastructure servers + development tools

Ceph 14.2.1 Nautilus

Ansible 2.8.5

ceph-ansible stable-4.0

Cluster Environment Planning

Figure 1-2 shows the physical networking.


Figure 1-2 Physical networking diagram

Table 1-3 lists the cluster deployment plan.

Table 1-3 Cluster deployment
Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 1-4 lists the client deployment plan.

Table 1-4 Client deployment
Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE
● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

1.3 Configuring the Deployment Environment

Configuring Hostnames

Step 1 Configure static hostnames, for example, ceph1 to ceph3 for server nodes and client1 to client3 for client nodes.
1. Configure hostnames for the server nodes. Set the hostname of server node 1 to ceph1:
hostnamectl --static set-hostname ceph1
Set hostnames for the other server nodes in the same way.
2. Configure hostnames for the client nodes. Set the hostname of client node 1 to client1:
hostnamectl --static set-hostname client1
Set hostnames for the other client nodes in the same way.

Step 2 Modify the domain name resolution file.
vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring Password-Free Login

Enable ceph1 and client1 to access all server and client nodes (including ceph1 and client1 themselves) without a password.

Step 1 Generate a public key on ceph1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done


NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

Step 2 Generate a public key on client1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.
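To confirm that password-free login works before continuing, you can optionally run a quick check from ceph1 and from client1 (hostnames as planned in 1.2):

for i in {1..3}; do ssh ceph$i hostname; done
for i in {1..3}; do ssh client$i hostname; done

Each command should print the remote hostname without prompting for a password.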

----End


Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Disabling SELinux

Disable SELinux on each server node and client node.
● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0
● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.
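If you prefer a non-interactive change, the following sketch makes the same edit with sed and then verifies it (back up /etc/selinux/config first):

sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
grep '^SELINUX=' /etc/selinux/config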

Configuring Repo Sources

This section provides online and offline installation modes for Repo sources. The online installation mode is recommended.

Method 1: online installation

Step 1 Create the ceph.repo file on each server node and client node.
vi /etc/yum.repos.d/ceph.repo
Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md


gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache

Step 3 Install the EPEL source. yum -y install epel-release

Step 4 Modify the proxy configuration on all nodes.
vim /etc/environment
Add the following information to support the installation of related dependencies:

export http_proxy=http://{Proxy-User-Name}:{Proxy-Password}@:
export https_proxy=http://{Proxy-User-Name}:{Proxy-Password}@:
export ftp_proxy=http://{Proxy-User-Name}:{Proxy-Password}@:
export no_proxy=127.0.0.1,localhost

----End


Method 2: offline installation

Perform the following operations on each node to configure Repo sources for all nodes in the cluster:

NOTE

In this version, you need to manually create the source.zip package. For details, see Creating a Repo Source Package.

Step 1 Upload the source.zip package to the /home directory and go to the directory.
cd /home
Step 2 Decompress the package.
unzip source.zip

Step 3 Install createrepo.
yum install -y createrepo/*.rpm

Step 4 Create local sources.
cd /home/local_source
createrepo .


Step 5 Go to the yum.repos.d directory.
cd /etc/yum.repos.d

Step 6 Back up the .repo files provided by the system.
mkdir bak
mv *.repo bak

Step 7 Create a .repo file.
vi local.repo

Add the following information to the file:
[local]
name=local
baseurl=file:///home/local_source
enabled=1
gpgcheck=0
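Optionally rebuild the Yum cache so that the local repository takes effect, and confirm that it is visible:

yum clean all && yum makecache
yum repolist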

----End

Configuring NTP

Ceph automatically checks the time of the storage nodes and generates an alarm if a large time difference is detected. To keep the storage nodes synchronized, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate
2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak
3. Create an NTP file on ceph1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8


NOTE
restrict 192.168.3.0 mask 255.255.255.0 // ceph1 network segment and subnet mask
4. Create an NTP file on ceph2, ceph3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph2, ceph3, and all client nodes function as NTP clients:
server 192.168.3.166
5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd
2. Run the following command on all nodes except ceph1 to forcibly synchronize the NTP server (ceph1) time to all the other nodes:
ntpdate ceph1
3. Write the hardware clock on all nodes except ceph1 to prevent configuration failures after a restart.
hwclock -w
4. Install and start the crontab tool on all nodes except ceph1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e
5. Add the following information so that all nodes except ceph1 automatically synchronize time with ceph1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166
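To verify time synchronization, you can optionally query the NTP peers on any node other than ceph1; ceph1 (192.168.3.166) should be listed as the selected peer (marked with an asterisk):

ntpq -p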

----End

1.4 Installing ceph-ansible

Installing ceph-ansible

Both online and offline installation modes are supported. You only need to perform the following operations on ceph 1.

Method 1: online installation


Step 1 Install Ansible. yum -y install ansible

Step 2 Install ceph-ansible.
1. Ensure that Git has been installed in the environment. You can run the following command to install Git:
yum -y install git
2. Set http.sslVerify to false to skip the verification of the system certificate.
git config --global http.sslVerify false
3. Download the ceph-ansible package.
git clone -b stable-4.0 https://github.com/ceph/ceph-ansible.git --recursive

----End

Method 2: offline installation

Step 1 Install Ansible. yum -y install ansible

Step 2 Decompress the ceph-ansible package. unzip /home/ceph-ansible-stable-4.0.zip


----End

Installing ceph-ansible Dependencies

Both online and offline installation modes are supported. You only need to perform the following operations on ceph1.

Method 1: online installation

Step 1 Install Python pip.
yum install -y python-pip

Step 2 Upgrade the pip version to 19.3.1. pip install --upgrade pip

Step 3 Go to the ceph-ansible directory. cd /home/ceph-ansible/

Step 4 Check and install the required software versions. pip install -r requirements.txt
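After the requirements are installed, you can optionally confirm the tool versions (Ansible 2.8.x is expected for the stable-4.0 branch):

ansible --version
pip --version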

----End

Method 2: offline installation

Step 1 Install Python pip. yum install -y python-pip

Step 2 Upgrade the pip version to 19.3.1.


pip install pip-19.3.1-py2.py3-none-any.whl

Step 3 Go to the ceph-ansible directory. cd /home/ceph-ansible/

Step 4 Install the required dependencies. yum install -y python-netaddr

Step 5 Check and install the required software versions. pip install -r requirements.txt

----End

Solving Environment Dependency Problems

Step 1 Install dependencies on all nodes. yum install -y yum-plugin-priorities

Step 2 Solve the rpm_key dependency problem.
1. Open the redhat_community_repository.yml file.
vim /home/ceph-ansible/roles/ceph-common/tasks/installs/redhat_community_repository.yml
2. Comment out the task that uses the rpm_key module.

Step 3 Solve the Grafana dependency problem.
1. Open the configure_grafana.yml file.
vim /home/ceph-ansible/roles/ceph-grafana/tasks/configure_grafana.yml
2. Comment out the code that triggers the Grafana dependency error.

----End

Creating the Service Node Configuration List

Create the hosts file in the ceph-ansible directory.


vi /home/ceph-ansible/hosts
Add the following information to the file:
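A representative hosts file for the three-node cluster planned in 1.2 looks as follows (the role assignment is illustrative and should be adjusted to your deployment; add [mdss] or [rgws] sections only for file or object storage):

[mons]
ceph1
ceph2
ceph3

[mgrs]
ceph1
ceph2
ceph3

[osds]
ceph1
ceph2
ceph3

[clients]
client1

[grafana-server]
ceph1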

This operation is used to define hosts in the cluster and the role of each host in the Ceph cluster. You can deploy applications on cluster nodes based on the requirements of the entire cluster. Table 1-5 describes the parameters in the hosts file.

Table 1-5 Parameters in the hosts file
Parameter    Description

[mons] Specifies Monitor (MON) nodes.

[mgrs] Specifies Manager (MGR) nodes.

[osds] Specifies Object Storage Daemon (OSD) nodes, which are configured in the osds.yml file.

[mdss] Specifies metadata server (MDS) nodes. This parameter must be specified for the file storage.

[rgws] Specifies RADOS gateway (RGW) nodes. This parameter must be specified for the object storage.

[clients] Specifies client nodes.

[grafana-server] Generally, this parameter cannot be deleted. Otherwise, an error will be reported indicating that the parameter is missing.


Modifying the ceph-ansible Configuration File

During Ansible deployment, the appropriate playbook must be passed to the ansible-playbook command. Change the playbook name and modify the corresponding content to meet the cluster deployment requirements. Use the Ansible variables provided by ceph-ansible to configure the Ceph cluster. All options and default configurations are stored in the group_vars directory. Each Ceph process corresponds to a configuration file.

cd /home/ceph-ansible/group_vars/
cp mons.yml.sample mons.yml
cp mgrs.yml.sample mgrs.yml
cp mdss.yml.sample mdss.yml
cp rgws.yml.sample rgws.yml
cp osds.yml.sample osds.yml
cp clients.yml.sample clients.yml
cp all.yml.sample all.yml

NOTE

all.yml.sample is a special configuration file that applies to all hosts in a cluster.

Adding parameters in the ceph.conf file

The ceph_conf_overrides variable in the all.yml file can be used to overwrite or add configuration in the ceph.conf file. Open the all.yml file.

vim all.yml
Modify the configuration of ceph_conf_overrides:
ceph_conf_overrides:
  global:
    osd_pool_default_pg_num: 64
    osd_pool_default_pgp_num: 64
    osd_pool_default_size: 2
  mon:
    mon_allow_pool_create: true

Defining OSDs

You can configure OSD drives, write-ahead logging (WAL) drives, and database (DB) drives in the osds.yml file by using two methods.

Method 1: Running the ceph-volume lvm batch command

In the osds.yml file under the group_vars directory, set devices, dedicated_devices, and bluestore_wal_devices as required.


● devices specifies the data drives in the system. This parameter can be used together with osds_per_device to set the number of OSDs created on each drive.
● dedicated_devices specifies DB drives.
● bluestore_wal_devices specifies WAL drives.
● The drives specified by dedicated_devices and bluestore_wal_devices must be different.
● dedicated_devices and bluestore_wal_devices can contain one or more drives.
● The system evenly divides the drives specified by dedicated_devices and bluestore_wal_devices based on the number of drives specified by devices and the value of osds_per_device.
A minimal example is shown after the notice below.

NOTICE

The DB drive size of a single OSD must be greater than or equal to 50 GB. Otherwise, an error may be reported during the installation because the DB drive size is too small.
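A minimal osds.yml sketch for this method is shown below; the drive letters and NVMe devices are placeholders and must match your own drive topology:

devices:
  - /dev/sda
  - /dev/sdb
  - /dev/sdc
osds_per_device: 1
dedicated_devices:
  - /dev/nvme0n1
bluestore_wal_devices:
  - /dev/nvme1n1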

Method 2: Running the ceph-volume command to create OSDs from LVs

In the osds.yml file under the group_vars directory, specify all OSD devices in the lvm_volumes attribute as required. The following scenarios are supported:
● Specify only data drives.
● Specify data drives and WAL drives.
● Specify data drives, WAL drives, and DB drives.
● Specify data drives and DB drives.
Before using this mode, create the corresponding LVs on all cluster nodes.


a. Create the corresponding volume group (VG).
/usr/sbin/vgcreate --yes -s 1G ceph-data /dev/sdd
NOTE
The command creates a VG named ceph-data. -s specifies the basic unit (extent size) for creating an LV.
b. Create an LV based on the VG.
/usr/sbin/lvcreate -l 300 -n osd-data2 ceph-data
NOTE
The command creates on ceph-data an LV named osd-data2 whose size is 300 GB.
c. Check the LV creation.
lvs
[root@ceph1 ceph-ansible-stable-4.0.bak]# lvs
  LV        VG        Attr       LSize
  home      centos    -wi-ao---- 839.05g
  root      centos    -wi-ao---- 50.00g
  swap      centos    -wi-a----- 4.00g
  osd-data1 ceph-data -wi-ao---- 300.00g
  osd-data2 ceph-data -wi-ao---- 300.00g
  osd-db1   ceph-db   -wi-ao---- 100.00g
  osd-db2   ceph-db   -wi-ao---- 100.00g
  osd-wal1  ceph-wal  -wi-ao---- 50.00g
  osd-wal2  ceph-wal  -wi-ao---- 50.00g

The following example shows how to define two OSDs, each of which has its own DB and WAL LVs.
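A sketch of the corresponding lvm_volumes entries, based on the LV and VG names created above (check the key names against osds.yml.sample in your copy of ceph-ansible), is:

lvm_volumes:
  - data: osd-data1
    data_vg: ceph-data
    db: osd-db1
    db_vg: ceph-db
    wal: osd-wal1
    wal_vg: ceph-wal
  - data: osd-data2
    data_vg: ceph-data
    db: osd-db2
    db_vg: ceph-db
    wal: osd-wal2
    wal_vg: ceph-wal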

1.5 Configuring Block Storage

Defining Ceph Service Requirements

Select the services required by the cluster. The services required in the block storage scenario include MONs and OSDs.


cd ..
cp site.yml.sample site.yml
vim site.yml

Defining the Cluster Host Service

Open the hosts file in the ceph-ansible directory.

vi hosts

Modify the applications on each node in the cluster as required.

Defining the Ceph Cluster Configuration

Modify the all.yml file in the group_vars directory, including:

1. Ceph download mode and version information
2. Basic network information
3. OSD type

Perform the following steps:


Step 1 Open the all.yml file.
vim group_vars/all.yml
Step 2 Modify the configuration in the file as follows:
● Online Ceph download mode
ceph_origin: repository
ceph_repository: community
ceph_mirror: http://download.ceph.com
ceph_stable_release: nautilus
ceph_stable_repo: "{{ ceph_mirror }}/rpm-{{ ceph_stable_release }}"
ceph_stable_redhat_distro: el7
monitor_interface: enp133s0
journal_size: 5120
public_network: 192.168.3.0/24
cluster_network: 192.168.4.0/24
osd_objectstore: bluestore
Table 1-6 describes the parameters.

Table 1-6 Parameter description
Parameter    Description

monitor_interface Network interface device ID.

public_network Public network IP address and mask.

cluster_network Cluster network IP address and mask.

● Offline Ceph download mode
ceph_origin: distro
ceph_repository: local
ceph_stable_release: nautilus

----End

Defining OSDs

Define the required OSDs by referring to Defining OSDs. After these operations are performed, the block storage deployment environment is ready.

1.6 Configuring File Storage

Defining Ceph Service Requirements

Step 1 Open the site.yml file.
vim site.yml
Step 2 Add mdss to the hosts list in the file (uncomment or add the - mdss entry).


----End

Defining the Cluster Host Service

Open the hosts file in the ceph-ansible directory.

vi hosts

Modify the applications on each node in the cluster as required.

Defining the Ceph Cluster Configuration

Step 1 Open the all.yml file. vim all.yml

Step 2 Search for the "CephFS" keyword and modify the configuration as follows:
● Online Ceph download mode
ceph_origin: repository
ceph_repository: community
ceph_mirror: http://download.ceph.com
ceph_stable_release: nautilus


ceph_stable_repo: "{{ ceph_mirror }}/rpm-{{ ceph_stable_release }}"
ceph_stable_redhat_distro: el7
monitor_interface: enp133s0
journal_size: 5120
public_network: 192.168.3.0/24
cluster_network: 192.168.4.0/24
osd_objectstore: bluestore

##########
# CEPHFS #
##########
cephfs: cephfs # name of the ceph filesystem
cephfs_data_pool:
  name: "{{ cephfs_data if cephfs_data is defined else 'cephfs_data' }}"
  pg_num: "{{ osd_pool_default_pg_num }}"
  pgp_num: "{{ osd_pool_default_pg_num }}"
  rule_name: "replicated_rule"
  type: 1
  # erasure_profile: ""
  # expected_num_objects: ""
  application: "cephfs"
  size: "{{ osd_pool_default_size }}"
  min_size: "{{ osd_pool_default_min_size }}"
cephfs_metadata_pool:
  name: "{{ cephfs_metadata if cephfs_metadata is defined else 'cephfs_metadata' }}"
  pg_num: "{{ osd_pool_default_pg_num }}"
  pgp_num: "{{ osd_pool_default_pg_num }}"
  rule_name: "replicated_rule"
  type: 1
  # erasure_profile: ""
  # expected_num_objects: ""
  application: "cephfs"
  size: "{{ osd_pool_default_size }}"
  min_size: "{{ osd_pool_default_min_size }}"
cephfs_pools:
  - "{{ cephfs_data_pool }}"
  - "{{ cephfs_metadata_pool }}"

NOTE
monitor_interface indicates the network interface device ID of the public network.
● Offline Ceph download mode
ceph_origin: distro
ceph_repository: local
ceph_stable_release: nautilus

----End

Defining OSDs Define the required OSDs by referring to Defining OSDs.

1.7 Configuring Object Storage

Defining Ceph Service Requirements

Step 1 Open the site.yml file.
vim site.yml
Step 2 Add the definition of rgws under the hosts option.
- hosts:
  - mons
  - osds
# - mdss


  - rgws
# - nfss
# - rbdmirrors
  - clients
  - mgrs
# - iscsigws
# - iscsi-gws # for backward compatibility only!
# - grafana-server
# - rgwloadbalancers

----End

Defining the Cluster Host Service

Define the nodes where the RGW service is located. In this example, the nodes are ceph1 and ceph2.
[mons]
ceph1

[mgrs] ceph1

[osds] ceph1 ceph2

#[mdss] #ceph1 #ceph2

[rgws] ceph1 ceph2

[clients] client3

#[grafana-server] #ceph1

Defining the Ceph Cluster Configuration

In the all.yml file, set the front-end network type, port, and number of RGW instances on each RGW node.

● Online Ceph download mode
ceph_origin: repository
ceph_repository: community
ceph_mirror: http://download.ceph.com
ceph_stable_release: nautilus
ceph_stable_repo: "{{ ceph_mirror }}/rpm-{{ ceph_stable_release }}"
ceph_stable_redhat_distro: el7
monitor_interface: enp133s0
journal_size: 5120
public_network: 192.168.3.0/24
cluster_network: 192.168.4.0/24

osd_objectstore: bluestore

## Rados Gateway options
radosgw_frontend_type: beast
radosgw_frontend_port: 12345
radosgw_interface: "{{ monitor_interface }}"
radosgw_num_instances: 3


Table 1-7 describes the parameters.

Table 1-7 Parameter description
Parameter    Description

monitor_interface Network interface device ID of the public network.

radosgw_frontend_type RGW front-end type. This parameter is optional.

radosgw_frontend_port RGW front-end port. This parameter is optional.

radosgw_num_instances Number of RGW instances on each node. RGW access ports are numbered incrementally based on the port number specified by radosgw_frontend_port.

● Offline Ceph download mode
ceph_origin: distro
ceph_repository: local
ceph_stable_release: nautilus

Setting the Permissions of the RGW Service

In the rgws.yml file, set pg_num and size for the default RGW data pool and index pool, and specify whether the RGW service can access private devices.
rgw_create_pools:
  defaults.rgw.buckets.data:
    pg_num: 8
    size: ""
  defaults.rgw.buckets.index:
    pg_num: 8
    size: ""

###########
# SYSTEMD #
###########
# ceph_rgw_systemd_overrides will override the systemd settings
# for the ceph-rgw services.
# For example, to set "PrivateDevices=false" you can specify:
ceph_rgw_systemd_overrides:
  Service:
    PrivateDevices: False

Defining OSDs

Define the required OSDs by referring to Defining OSDs.

1.8 Deploying the Ceph Cluster

Step 1 Deploy the Ceph cluster.
ansible-playbook -i hosts site.yml
If "failed=0" is displayed for all nodes (as shown in the following figure) after the command is executed, the deployment is successful.


Figure 1-3 Deployment result

Step 2 Check the cluster health status. If "HEALTH_OK" is displayed, the cluster is healthy. ceph health

Step 3 After the Ceph cluster is deployed using ceph-ansible, check whether configuration is written into the ceph.conf file. If so, the deployment is successful.
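For example, on any cluster node you can print the generated file and check that the expected options (such as the public and cluster networks and the ceph_conf_overrides settings) are present:

cat /etc/ceph/ceph.conf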

----End

1.9 Expanding the Cluster


1.9.1 Adding Ceph MON Nodes

Prerequisites
1. A Ceph cluster deployed using Ansible is running properly.
2. You have the root permission on the new nodes.

Procedure

Step 1 Add an MON node name to the [mons] section in the hosts file of the cluster.

Step 2 Ensure that Ansible can connect to the node. ansible all -i hosts -m ping

Step 3 Run the playbook.
● Method 1: Switch to the ceph-ansible home directory and run the site.yml playbook.
ansible-playbook -i hosts site.yml
● Method 2: Copy infrastructure-playbooks/add-mon.yml to the home directory and run the playbook.
cp infrastructure-playbooks/add-mon.yml add-mon.yml
ansible-playbook -i hosts add-mon.yml

----End

1.9.2 Adding OSD Nodes

1.9.2.1 Adding Ceph OSDs with the Same Drive Topology as Existing OSDs

Ansible uses the group_vars/osds.yml file to configure OSDs.

Prerequisites
1. A Ceph cluster deployed using Ansible is running properly.
2. You have the root permission on the new nodes.
3. A node that has the same number of data drives as other OSD nodes in the cluster is available.


Procedure

Step 1 Copy the add-osd.yml file in the /ceph-ansible/infrastructure-playbooks/ directory to the home directory.
cp ./infrastructure-playbooks/add-osd.yml ./add-osd.yml
Step 2 Add an OSD node name to the [osds] section in the hosts file of the cluster.

Step 3 Ensure that Ansible can connect to the node.
ansible all -i hosts -m ping
Step 4 Run the playbook.
ansible-playbook -i hosts add-osd.yml

----End

1.9.2.2 Adding Ceph OSDs with a Different Drive Topology from Existing OSDs

Prerequisites
1. A Ceph cluster deployed using Ansible is running properly.
2. You have the root permission on the new nodes.

Procedure

Step 1 Add an OSD node name to the [osds] section in the hosts file of the cluster, and enter the required device information under the new OSD node.

Step 2 Ensure that Ansible can connect to the node.
ansible all -i hosts -m ping
Step 3 Copy the add-osd.yml file to the home directory.
cp ./infrastructure-playbooks/add-osd.yml ./add-osd.yml
Step 4 Run the playbook.
ansible-playbook -i hosts add-osd.yml

----End

1.9.3 Adding MDS Nodes

Prerequisites
1. A Ceph cluster deployed using Ansible is running properly.
2. A management node where Ansible is installed is available.


Procedure

Step 1 Add an MDS node name to the [mdss] section in the hosts file of the cluster.

Step 2 Delete the comment sign before mdss in the site.yml file in the home directory.

Step 3 Run the playbook in the home directory. ansible-playbook -i hosts site.yml

----End

1.10 Deleting the Cluster

NOTE

Perform this operation only on ceph 1.

Step 1 Run the following command to delete the cluster: ansible-playbook -i hosts infrastructure-playbooks/purge-cluster.yml

Step 2 Type yes to confirm the deletion. Are you sure you want to purge the cluster? [no]:yes

----End

1.11 More Resources

Creating a Repo Source Package

Step 1 Run the reposync command to download all files of the EPEL Repo image source to the host in batches.
1. Download the files of the epel.repo image repository.
reposync -r epel -p /opt/EPEL
2. Download the files of the CentOS-Base.repo image repository.
reposync -r CentOS-Base -p /opt/CentOS-Base
3. Download the files of the ceph.repo image repository.
reposync -r ceph -p /opt/ceph


NOTE
In the preceding commands, /opt/EPEL, /opt/CentOS-Base, and /opt/ceph are the download destination directories. If no path is specified, the files are saved in the current directory.
Step 2 Generate the repodata file in the /opt/EPEL, /opt/CentOS-Base, and /opt/ceph directories.
yum install createrepo
cd /opt/EPEL
createrepo .
cd /opt/CentOS-Base
createrepo .
cd /opt/ceph
createrepo .
Step 3 Compress the /opt/EPEL, /opt/CentOS-Base, and /opt/ceph directories into .zip files.
zip -r EPEL.zip /opt/EPEL
zip -r CentOS-Base.zip /opt/CentOS-Base
zip -r ceph.zip /opt/ceph
Step 4 Save the source image files (EPEL.zip, CentOS-Base.zip, and ceph.zip) to the local host by using a DVD-ROM or mobile storage medium to set up a local image repository.

----End


2 Ceph Block Storage Deployment Guide (CentOS 7.6)

2.1 Introduction
2.2 Environment Requirements
2.3 Configuring the Deployment Environment
2.4 Installing Ceph
2.5 Verifying Ceph

2.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 2-1 shows the Ceph architecture.
This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 2-1 Ceph architecture

Table 2-1 describes the modules in the preceding figure.

Table 2-1 Module functions
Module    Function

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.


Module Function

MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a library that simplifies access to RADOS. It currently supports the PHP, Ruby, Java, Python, C, and C++ programming languages. It provides a native interface to RADOS, the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

2.2 Environment Requirements

Hardware Requirements

Table 2-2 lists the hardware requirements.

Table 2-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives
System drives: RAID 1 (2 x 960 GB SATA SSDs)
Data drives: JBOD enabled in RAID mode (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 2-3 lists the software requirements.

Table 2-3 Required software versions
Software    Version (CentOS)

OS CentOS Linux release 7.6.1810

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE
● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions.
● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 2-2 shows the physical networking.


Figure 2-2 Physical networking diagram

Table 2-4 lists the cluster deployment plan.

Table 2-4 Cluster deployment
Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 2-5 lists the client deployment plan.

Table 2-5 Client deployment
Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE
● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition used in the Jewel version is no longer needed. Instead, a DB partition (metadata partition) and a WAL partition are used, which store the metadata and the write-ahead log generated by the BlueStore back end, respectively.
In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of one OSD, and the NVMe drive provides the DB and WAL partitions for the 12 OSDs. Generally, a WAL partition larger than 10 GB is sufficient. According to the official Ceph documentation, the size of each DB partition should be at least 4% of the capacity of the corresponding data drive, and it can be adjusted based on the NVMe drive capacity. In this example, the WAL partition size is 60 GB and the DB partition size is 180 GB. Table 2-6 lists the partitions of one OSD.

Table 2-6 Drive partitions
Data Drive    DB Partition    WAL Partition
4 TB          180 GB          60 GB
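As a quick check of these sizes: 4% of a 4 TB data drive is about 160 GB, so the 180 GB DB partition meets the recommendation, and the 12 OSDs together consume 12 x (180 GB + 60 GB) = 2880 GB, which fits on the 3.2 TB NVMe SSD with some headroom.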

2.3 Configuring the Deployment Environment

Configuring the EPEL Source

On each server node and client node, run the following command to configure the Extra Packages for Enterprise Linux (EPEL) source:
yum install epel-release -y


Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, ceph1 to ceph3 for server nodes and client1 to client3 for client nodes.
1. Configure hostnames for the server nodes. Set the hostname of server node 1 to ceph1:
hostnamectl --static set-hostname ceph1
Set hostnames for the other server nodes in the same way.
2. Configure hostnames for the client nodes. Set the hostname of client node 1 to client1:
hostnamectl --static set-hostname client1
Set hostnames for the other client nodes in the same way.

Step 2 Modify the domain name resolution file.
vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent the time difference between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate


2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak
3. Create an NTP file on ceph1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE
restrict 192.168.3.0 mask 255.255.255.0 // ceph1 network segment and subnet mask
4. Create an NTP file on ceph2, ceph3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph2, ceph3, and all client nodes function as NTP clients:
server 192.168.3.166
5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd
2. Run the following command on all nodes except ceph1 to forcibly synchronize the NTP server (ceph1) time to all the other nodes:
ntpdate ceph1
3. Write the hardware clock on all nodes except ceph1 to prevent configuration failures after a restart.
hwclock -w
4. Install and start the crontab tool on all nodes except ceph1.
yum install -y crontabs
chkconfig crond on


systemctl start crond
crontab -e
5. Add the following information so that all nodes except ceph1 automatically synchronize time with ceph1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166

----End

Configuring Password-Free Login

Enable ceph1 and client1 to access all server and client nodes (including ceph1 and client1) without a password.

Step 1 Generate a public key on ceph1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.
● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0
● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.


Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node.
vi /etc/yum.repos.d/ceph.repo
Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache

----End

2.4 Installing Ceph


2.4.1 Installing the Ceph Software

NOTE
When you use yum install to install Ceph, the latest version is installed by default. In this document, the latest version is Ceph 14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, perform the following operations to install Ceph 14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. Then, when you run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.

Step 1 Install Ceph on each server node and client node. yum -y install ceph

Step 2 Install ceph-deploy on ceph 1. yum -y install ceph-deploy

Step 3 Run the following command on each node to check the version:
ceph -v
The command output is similar to the following:
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End


2.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster.
cd /etc/ceph
ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph.
vi /etc/ceph/ceph.conf
Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE
● Run the node configuration commands and the ceph-deploy OSD commands in the /etc/ceph directory. Otherwise, an error may occur.
● The modification isolates the internal cluster network from the external access network. 192.168.4.0 is used for data synchronization between internal storage clusters (used only between storage nodes), and 192.168.3.0 is used for data exchange between storage nodes and compute nodes.
Step 3 Initialize the monitors and collect the keys.
ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node. ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful.
ceph -s
The configuration is successful if the command output is similar to the following:
  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

2.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes. ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed.
ceph -s
The MGR nodes are successfully deployed if the command output is similar to the following:
  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
    mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

2.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/ nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, you only need to change /dev/nvme0n1 to the actual driver letters.

The NVMe SSD is divided into twelve 60 GB partitions and twelve 180 GB partitions, which correspond to the WAL and DB partitions respectively.

Step 1 Create the partition.sh script. vi partition.sh

Step 2 Add the following information to the script:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`
do
  ((b = $(( $j * 8 ))))
  ((a = $(( $b - 8 ))))
  ((c = $(( $b - 6 ))))
  str="%"
  echo $a
  echo $b
  echo $c
  parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
  parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done


NOTE
This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.
Step 3 Run the script.
bash partition.sh
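Optionally confirm that the 24 partitions (12 WAL and 12 DB) were created before deploying the OSDs:

parted /dev/nvme0n1 print
lsblk /dev/nvme0n1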

----End

Deploying OSD Nodes

NOTE
In the following script, the 12 drives /dev/sda to /dev/sdl are data drives, and the OS is installed on /dev/sdm. If the data drives are not numbered consecutively, for example, if the OS is installed on /dev/sde, you cannot run the script directly; otherwise, an error will be reported during the deployment on /dev/sde. Instead, modify the script so that only data drives are operated on and other drives, such as the OS drive and the SSD used for the DB and WAL partitions, are left untouched.

Step 1 Run the following command to check the drive letter of each drive on each node:
lsblk

In the example output, /dev/sda is the OS drive.

NOTE

The drives that were previously used as OS drives or data drives in a Ceph cluster may have residual partitions. You can run the lsblk command to check for drive partitions. For example, if /dev/sdb has partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy

CAUTION

Identify the data drives first, and run the zap command with --destroy only on data drives that have residual partitions.

Step 2 Create the create_osd.sh script on ceph 1 to deploy OSDs on the 12 data drives of each server.
cd /etc/ceph/
vi /etc/ceph/create_osd.sh

Add the following information to the script:

#!/bin/bash

for node in ceph1 ceph2 ceph3


do
    j=1
    k=2
    for i in {a..l}
    do
        ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
        ((j=${j}+2))
        ((k=${k}+2))
        sleep 3
    done
done

NOTE

● This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script, for example, as shown in the sketch below.
● In the ceph-deploy osd create command:
  – ${node} specifies the hostname of the node.
  – --data specifies the data drive.
  – --block-db specifies the DB partition.
  – --block-wal specifies the WAL partition.
  DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured or NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal. You only need to specify --data.
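If the data drives are not lettered consecutively (for example, because the OS drive or the NVMe SSD sits between them), a minimal variant of create_osd.sh such as the following sketch can list the data drives explicitly instead of iterating over {a..l}. The drive letters in DATA_DRIVES are placeholders; replace them with the data drives identified by lsblk in your environment.

#!/bin/bash
# Sketch only: enumerate the data drives explicitly so that the OS drive and the
# NVMe SSD that holds the DB/WAL partitions are never touched.
DATA_DRIVES="sdb sdc sdd sdf sdg sdh sdi sdj sdk sdl sdm sdn"   # placeholder letters

for node in ceph1 ceph2 ceph3
do
    j=1    # WAL partition index on /dev/nvme0n1
    k=2    # DB partition index on /dev/nvme0n1
    for i in ${DATA_DRIVES}
    do
        ceph-deploy osd create ${node} --data /dev/${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
        ((j=${j}+2))
        ((k=${k}+2))
        sleep 3
    done
done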

Step 3 Run the script on ceph 1.
bash create_osd.sh

Step 4 Check whether all 36 OSDs are in the up state.
ceph -s

----End

2.5 Verifying Ceph

2.5.1 Creating a Storage Pool

NOTE

● Ceph 14.2.1 and later versions have no default storage pool. You need to create a storage pool and then create block devices in the storage pool.
● Perform operations in this section only on ceph 1.

Step 1 Create a storage pool, for example, vdbench.
cd /etc/ceph
ceph osd pool create vdbench 1024 1024


NOTE

● In the preceding command, vdbench is the storage pool name, and the two 1024 values are the placement group (PG) quantity and placement group for placement purpose (PGP) quantity.
● The two 1024 values in the storage pool creation command (for example, ceph osd pool create vdbench 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool PGs in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of replicas. For the erasure code (EC) mode, the data redundancy factor is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, the cluster consists of three servers. Each server has 12 OSDs, and there are 36 OSDs in total. According to the preceding formula, the calculated PG quantity is 1200. It is recommended that the PG quantity be an integral power of 2. Therefore, the PG quantity of vdbench is set to 1024. (A small calculation sketch is provided at the end of this section.)

Step 2 After a storage pool is created for Ceph 14.2.10, you need to specify the pool type (CephFS, RBD, or RGW). The following uses the block storage mode as an example.
ceph osd pool application enable vdbench rbd

NOTE

● vdbench is the storage pool name and rbd is the storage pool type.
● You can add --yes-i-really-mean-it to the end of the command to change the storage pool type.

Step 3 (Optional) Enable zlib compression for the storage pool.
ceph osd pool set vdbench compression_algorithm zlib
ceph osd pool set vdbench compression_mode force
ceph osd pool set vdbench compression_required_ratio .99

NOTE

This step enables OSD compression. Skip this step if OSD compression is not required.
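As a quick sanity check of the PG sizing rule described in the note for Step 1, the following shell arithmetic (an illustration only, not part of the official procedure) computes the target PG count for this cluster and rounds it down to the nearest power of 2:

# Illustrative PG calculation for 36 OSDs with 3 replicas.
osds=36
replicas=3
target=$(( osds * 100 / replicas ))   # 1200 for this cluster

pg_num=1
while (( pg_num * 2 <= target ))
do
    pg_num=$(( pg_num * 2 ))
done
echo "target=${target}, recommended pg_num=${pg_num}"   # prints target=1200, recommended pg_num=1024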

----End

2.5.2 Creating Block Devices

NOTE

Perform operations in this section only on ceph 1.

Step 1 Run a script to create 30 block devices in the RBD storage pool. The size of each block device is 200 GB.


vi create_image.sh

Step 2 Add the following information to the script:
#!/bin/bash
pool="vdbench"
size="204800"

createimages()
{
    for image in {1..30}
    do
        rbd create image${image} --size ${size} --pool ${pool} --image-format 2 --image-feature layering
        sleep 1
    done
}
createimages

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash create_image.sh

Step 4 Check whether the creation is successful.
rbd ls --pool vdbench

The creation is successful if the command output contains image1, image2, image3, ..., image29, and image30.


NOTE

In the preceding command, --pool specifies the storage pool name and is used for viewing the images in this storage pool.

----End

2.5.3 Mapping Block Device Images

NOTE

● Perform operations in this section only on ceph 1. The script mentioned in the following operations logs in to client 1, client 2, and client 3 to map the 30 images created in the previous section to the three clients (10 RBDs for each client).
● The block storage images were already created in the previous section. The following operations map the images as local block devices. You can determine whether to perform the operations in this section based on the actual requirements.

Step 1 Create a script to map the 30 block devices created in the previous section to the clients.
vi map_image.sh

Step 2 Add the following information to the script:
#!/bin/bash
pool="vdbench"

mapimages()
{
    for i in {1..10}
    do
        ssh client1 "rbd map ${pool}/image${i}"
    done

    for i in {11..20}
    do
        ssh client2 "rbd map ${pool}/image${i}"
    done

    for i in {21..30}
    do
        ssh client3 "rbd map ${pool}/image${i}"
    done
}
mapimages

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash map_image.sh

Step 4 Log in to client 1, client 2, and client 3, respectively, and run the following command to check whether the mapping is successful:
ls /dev | grep rbd

If rbd0, rbd1, rbd2, ..., rbd8, and rbd9 are displayed in the command output, the mapping is successful.
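Optionally, the rbd showmapped command can also be run on each client to list the mapped images together with their /dev/rbdX device names; the exact output columns may vary slightly between Ceph releases.

# Run on each client to list the mapped RBD images and their local devices.
rbd showmapped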


----End


3 Ceph Block Storage Deployment Guide (openEuler 20.03)

3.1 Introduction
3.2 Environment Requirements
3.3 Configuring the Deployment Environment
3.4 Installing Ceph
3.5 Verifying Ceph

3.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 3-1 shows the Ceph architecture.

This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 3-1 Ceph architecture

Table 3-1 describes the modules in the preceding figure.

Table 3-1 Module functions

Module      Function

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.



MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a method that simplifies access to RADOS. Currently, it supports programming languages PHP, Ruby, Java, Python, C, and C++. It provides RADOS, a local interface of the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

3.2 Environment Requirements

Hardware Requirements

Table 3-2 lists the hardware requirements.

Table 3-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives System drives: RAID 1 (2 x 960 GB SATA SSDs); data drives: JBOD mode enabled on the RAID controller card (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 3-3 lists the software requirements.

Table 3-3 Required software versions

Software      Version

OS openEuler 20.03

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE

● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions.
● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 3-2 shows the physical networking.


Figure 3-2 Physical networking diagram

Table 3-4 lists the cluster deployment plan.

Table 3-4 Cluster deployment

Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 3-5 lists the client deployment plan.

Table 3-5 Client deployment

Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE

● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition used in the Jewel version is no longer used. Instead, a DB partition (metadata partition) and a WAL partition are used, which store the metadata and log files generated by the BlueStore back end, respectively.

In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of one OSD, and the NVMe drive provides the DB and WAL partitions for the 12 OSDs. Generally, the WAL partition is sufficient if its capacity is greater than 10 GB. According to the official Ceph documentation, it is recommended that the size of each DB partition be at least 4% of the capacity of each data drive. The size of each DB partition can be flexibly configured based on the NVMe drive capacity. In this example, the WAL partition capacity is 60 GB and the DB partition capacity is 180 GB. Table 3-6 lists the partitions of one OSD. (A small sizing sketch follows the table.)

Table 3-6 Drive partitions

Data Drive    DB Partition    WAL Partition

4 TB 180 GB 60 GB
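The 4% rule described above can be checked with a few lines of shell arithmetic. This is only an illustration of the sizing logic; the 180 GB DB partition chosen in this guide simply leaves extra headroom on the 3.2 TB NVMe drive.

# Illustrative sizing check for one OSD (values taken from this guide).
data_drive_gb=4000                            # 4 TB data drive, expressed in GB
min_db_gb=$(( data_drive_gb * 4 / 100 ))      # 4% rule -> 160 GB minimum
echo "Minimum DB partition: ${min_db_gb} GB (this guide uses 180 GB)"

# Total NVMe capacity consumed by 12 OSDs with 180 GB DB + 60 GB WAL each:
echo "NVMe usage: $(( 12 * (180 + 60) )) GB of roughly 3200 GB"   # 2880 GB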

3.3 Configuring the Deployment Environment

Configuring the EPEL Source

Perform the following operations on each server node and client node to configure the Extra Packages for Enterprise Linux (EPEL) source:

Step 1 Upload the everything image source file corresponding to the OS to the server.
Use the SFTP tool to upload the openEuler-***-everything-aarch64-dvd.iso package to the /root directory on the server.

Step 2 Create a local folder to mount the image.
mkdir -p /iso

Step 3 Mount the ISO file to the local directory.
mount /root/openEuler-***-everything-aarch64-dvd.iso /iso


Step 4 Create a Yum source for the image.
vi /etc/yum.repos.d/openEuler.repo

Add the following information to the file:
[Base]
name=Base
baseurl=file:///iso
enabled=1
gpgcheck=0
priority=1

Add an external image source.
[arch_fedora_online]
name=arch_fedora
baseurl=https://mirrors.huaweicloud.com/fedora/releases/30/Everything/aarch64/os/
enabled=1
gpgcheck=0
priority=2

----End

Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, configure ceph 1 to ceph 3 for server nodes and client 1 to client 3 for client nodes.
1. Configure hostnames for server nodes. Set the hostname of server node 1 to ceph 1:
hostnamectl --static set-hostname ceph1

Set hostnames for other server nodes in the same way.
2. Configure hostnames for client nodes. Set the hostname of client node 1 to client 1:
hostnamectl --static set-hostname client1

Set hostnames for other client nodes in the same way.

Step 2 Modify the domain name resolution file.
vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:


192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent the time difference between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate

2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak

3. Create an NTP file on ceph 1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE

restrict 192.168.3.0 mask 255.255.255.0 // ceph 1 network segment and subnet mask

4. Create an NTP file on ceph 2, ceph 3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph 2, ceph 3, and all client nodes function as NTP clients:
server 192.168.3.166

5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph 1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd


2. Run the following command on all nodes except ceph 1 to forcibly synchronize the time of the NTP server (ceph 1) to the other nodes:
ntpdate ceph1

3. Write the system time to the hardware clock on all nodes except ceph 1 to prevent configuration failures after a restart.
hwclock -w

4. Install and start the crontab tool on all nodes except ceph 1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e

5. Add the following information so that all nodes except ceph 1 automatically synchronize time with ceph 1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166

----End

Configuring Password-Free Login

Enable ceph 1 and client 1 to access all server and client nodes (including ceph 1 and client 1) without a password.

Step 1 Generate a public key on ceph 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.

● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0


● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.

Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node.
vi /etc/yum.repos.d/ceph.repo

Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache


----End

Changing the umask Value

Change the value of umask to 0022 so that Ceph can be properly installed.

Step 1 Open the bashrc file on all nodes in the cluster.
vi /etc/bashrc
Change the last line to umask 0022.

Step 2 Make the configuration take effect.
source /etc/bashrc

Step 3 Run the following command to check whether the configuration has taken effect:
umask

----End

3.4 Installing Ceph

3.4.1 Installing the Ceph Software

NOTE

When you use yum install to install Ceph, the latest version will be installed by default. In this document, the latest version is Ceph V14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, you can perform the following operations to install Ceph V14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. Then, when you run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.

Step 1 Install Ceph on each cluster node and client node.


dnf -y install ceph

Step 2 Install ceph-deploy on ceph 1.
pip install ceph-deploy
pip install prettytable

Step 3 Add a line of code to the _get_distro function in the /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py file to adapt ceph-deploy to the openEuler system (see the sketch after this step).
vi /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py
'openeuler': fedora,
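The added line belongs in the distribution-to-module mapping inside the _get_distro function, next to the existing 'fedora' entry, so that ceph-deploy treats openEuler like a Fedora-family system. After saving the file, a quick check such as the following sketch confirms the entry is present and that the module still imports; the python2 interpreter name is an assumption based on the python2.7 path used in this guide.

# Confirm the new mapping exists in the edited file.
grep -n "openeuler" /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py

# Confirm the module still imports cleanly after the edit.
python2 -c "import ceph_deploy.hosts"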

Step 4 Run the following command on each node to check the version:
ceph -v

The command output is similar to the following:


ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End

3.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster.
cd /etc/ceph
ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph.
vi /etc/ceph/ceph.conf

Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE

● Run the node configuration commands and use ceph-deploy to configure the OSDs in the /etc/ceph directory. Otherwise, an error may occur.
● This modification isolates the internal cluster network from the external access network. The 192.168.4.0 network is used for data synchronization within the storage cluster (between storage nodes only), and the 192.168.3.0 network is used for data exchange between storage nodes and compute nodes.

Step 3 Initialize the monitors and collect the keys.
ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node.
ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful.
ceph -s

The configuration is successful if the command output is similar to the following:

  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

3.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes.
ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed.
ceph -s

The MGR nodes are successfully deployed if the command output is similar to the following:

  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
    mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

3.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, change /dev/nvme0n1 to the actual drive letters.

The NVMe SSD is divided into twelve 60 GB partitions and twelve 180 GB partitions, which correspond to the WAL and DB partitions respectively.

Step 1 Create the partition.sh script.
vi partition.sh

Step 2 Add the following information to the script:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`
do
    ((b = $(( $j * 8 ))))
    ((a = $(( $b - 8 ))))
    ((c = $(( $b - 6 ))))
    str="%"
    echo $a
    echo $b
    echo $c
    parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
    parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done


NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash partition.sh

----End

Deploying OSD Nodes

NOTE

In the following script, the 12 drives /dev/sda to /dev/sdl are data drives, and the OS is installed on /dev/sdm. However, if the data drives are not numbered consecutively, for example, the OS is installed on /dev/sde, you cannot run the script directly. Otherwise, an error will be reported during the deployment on /dev/sde. Instead, you need to modify the script to ensure that only data drives are operated and other drives such as the OS drive and SSD drive for DB and WAL partitions are not operated.

Step 1 Run the following command to check the drive letter of each drive on each node:
lsblk

In the example output, /dev/sda is the OS drive.

NOTE

The drives that were previously used as OS drives or data drives in a Ceph cluster may have residual partitions. You can run the lsblk command to check for drive partitions. For example, if /dev/sdb has partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy

CAUTION

Identify the data drives first, and run the zap command with --destroy only on data drives that have residual partitions.

Step 2 Create the create_osd.sh script on ceph 1 to deploy OSDs on the 12 data drives of each server.
cd /etc/ceph/
vi /etc/ceph/create_osd.sh

Add the following information to the script:

#!/bin/bash

for node in ceph1 ceph2 ceph3


do
    j=1
    k=2
    for i in {a..l}
    do
        ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
        ((j=${j}+2))
        ((k=${k}+2))
        sleep 3
    done
done

NOTE

● This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.
● In the ceph-deploy osd create command:
  – ${node} specifies the hostname of the node.
  – --data specifies the data drive.
  – --block-db specifies the DB partition.
  – --block-wal specifies the WAL partition.
  DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured or NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal. You only need to specify --data.

Step 3 Run the script on ceph 1.
bash create_osd.sh

Step 4 Check whether all 36 OSDs are in the up state (a quick check sketch follows).
ceph -s
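As an optional cross-check in addition to ceph -s, the following commands summarize the OSD state. The exact wording of the output differs slightly between Ceph releases, so treat the strings in the comments as illustrative.

# One-line summary of OSD counts, for example "36 osds: 36 up (...), 36 in (...)".
ceph osd stat

# Per-OSD view: every OSD should show "up" in the STATUS column.
ceph osd tree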

----End

3.5 Verifying Ceph

3.5.1 Creating a Storage Pool

NOTE

● Ceph 14.2.1 and later versions have no default storage pool. You need to create a storage pool and then create block devices in the storage pool.
● Perform operations in this section only on ceph 1.

Step 1 Create a storage pool, for example, vdbench.
cd /etc/ceph
ceph osd pool create vdbench 1024 1024


NOTE

● In the preceding command, vdbench is the storage pool name, and the two 1024 values are the placement group (PG) quantity and placement group for placement purpose (PGP) quantity.
● The two 1024 values in the storage pool creation command (for example, ceph osd pool create vdbench 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool PGs in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of replicas. For the erasure code (EC) mode, the data redundancy factor is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, the cluster consists of three servers. Each server has 12 OSDs, and there are 36 OSDs in total. According to the preceding formula, the calculated PG quantity is 1200. It is recommended that the PG quantity be an integral power of 2. Therefore, the PG quantity of vdbench is set to 1024.

Step 2 After a storage pool is created for Ceph 14.2.10, you need to specify the pool type (CephFS, RBD, or RGW). The following uses the block storage mode as an example.
ceph osd pool application enable vdbench rbd

NOTE

● vdbench is the storage pool name and rbd is the storage pool type.
● You can add --yes-i-really-mean-it to the end of the command to change the storage pool type.

Step 3 (Optional) Enable zlib compression for the storage pool.
ceph osd pool set vdbench compression_algorithm zlib
ceph osd pool set vdbench compression_mode force
ceph osd pool set vdbench compression_required_ratio .99

NOTE

This step enables OSD compression. Skip this step if OSD compression is not required.

----End

3.5.2 Creating Block Devices

NOTE

Perform operations in this section only on ceph 1.

Step 1 Run a script to create 30 block devices in the RBD storage pool. The size of each block device is 200 GB.


vi create_image.sh

Step 2 Add the following information to the script:
#!/bin/bash
pool="vdbench"
size="204800"

createimages()
{
    for image in {1..30}
    do
        rbd create image${image} --size ${size} --pool ${pool} --image-format 2 --image-feature layering
        sleep 1
    done
}
createimages

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash create_image.sh

Step 4 Check whether the creation is successful.
rbd ls --pool vdbench

The creation is successful if the command output contains image1, image2, image3, ..., image29, and image30.


NOTE

In the preceding command, --pool specifies the storage pool name and is used for viewing the images in this storage pool.

----End

3.5.3 Mapping Block Device Images

NOTE

● Perform operations in this section only on ceph 1. The script mentioned in the following operations logs in to client 1, client 2, and client 3 to map the 30 images created in the previous section to the three clients (10 RBDs for each client).
● The block storage images were already created in the previous section. The following operations map the images as local block devices. You can determine whether to perform the operations in this section based on the actual requirements.

Step 1 Create a script to map the 30 block devices created in the previous section to the clients.
vi map_image.sh

Step 2 Add the following information to the script:
#!/bin/bash
pool="vdbench"

mapimages()
{
    for i in {1..10}
    do
        ssh client1 "rbd map ${pool}/image${i}"
    done

    for i in {11..20}
    do
        ssh client2 "rbd map ${pool}/image${i}"
    done

    for i in {21..30}
    do
        ssh client3 "rbd map ${pool}/image${i}"
    done
}
mapimages

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash map_image.sh

Step 4 Log in to client 1, client 2, and client 3, respectively, and run the following command to check whether the mapping is successful:
ls /dev | grep rbd

If rbd0, rbd1, rbd2, ..., rbd8, and rbd9 are displayed in the command output, the mapping is successful.


----End


4 Ceph Object Storage Deployment Guide (CentOS 7.6)

4.1 Introduction
4.2 Environment Requirements
4.3 Configuring the Deployment Environment
4.4 Installing Ceph
4.5 Verifying Ceph

4.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 4-1 shows the Ceph architecture.

This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 4-1 Ceph architecture

Table 4-1 describes the modules in the preceding figure.

Table 4-1 Module functions

Module      Function

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.



MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a method that simplifies access to RADOS. Currently, it supports programming languages PHP, Ruby, Java, Python, C, and C++. It provides RADOS, a local interface of the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

4.2 Environment Requirements

Hardware Requirements

Table 4-2 lists the hardware requirements.

Table 4-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives System drives: RAID 1 (2 x 960 GB SATA SSDs); data drives: JBOD mode enabled on the RAID controller card (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 4-3 lists the software requirements.

Table 4-3 Required software versions

Software      Version

OS CentOS Linux release 7.6.1810

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE

● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions.
● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 4-2 shows the physical networking.


Figure 4-2 Physical networking diagram

Table 4-4 lists the cluster deployment plan.

Table 4-4 Cluster deployment

Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 4-5 lists the client deployment plan.

Table 4-5 Client deployment

Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE

● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition in the Jewel version is no longer used. Instead, the DB partition (metadata partition) and WAL partition are used. The two partitions store back-end metadata and log files generated by BlueStore. The metadata is used to improve the efficiency of the entire storage system, and the logs are used to maintain system stability.

In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of an Object Storage Daemon (OSD), and the NVMe drive functions as the DB and WAL partitions of the 12 OSDs. Generally, the WAL partition is sufficient if its capacity is greater than 10 GB. According to the official Ceph document, it is recommended that the size of each DB partition be at least 4% of the capacity of each data drive. The size of each DB partition can be flexibly configured based on the NVMe drive capacity. In this solution, the NVMe drive is divided into 27 partitions. Twelve 20 GB partitions are used as WAL, and twelve 45 GB partitions are used as DB. To improve the object storage performance, each NVMe drive has three 700 GB partitions as the storage pool for storing RGW metadata.

Table 4-6 lists the partitions of an NVMe drive.

Table 4-6 Partitions of an NVMe drive

NVMe Drive DB Partition WAL Partition Metadata Storage Pool Partition

3.2 TB 12 x 45 GB 12 x 20 GB 3 x 700 GB

4.3 Configuring the Deployment Environment

Configuring the EPEL Source

On each server node and client node, run the following command to configure the Extra Packages for Enterprise Linux (EPEL) source:

yum install epel-release -y


Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, configure ceph 1 to ceph 3 for server nodes and client 1 to client 3 for client nodes.
1. Configure hostnames for server nodes. Set the hostname of server node 1 to ceph 1:
hostnamectl --static set-hostname ceph1

Set hostnames for other server nodes in the same way.
2. Configure hostnames for client nodes. Set the hostname of client node 1 to client 1:
hostnamectl --static set-hostname client1

Set hostnames for other client nodes in the same way.

Step 2 Modify the domain name resolution file.
vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End


Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent the time difference between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate

2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak

3. Create an NTP file on ceph 1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE

restrict 192.168.3.0 mask 255.255.255.0 // ceph 1 network segment and subnet mask

4. Create an NTP file on ceph 2, ceph 3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph 2, ceph 3, and all client nodes function as NTP clients:
server 192.168.3.166

5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph 1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd


2. Run the following command on all nodes except ceph 1 to forcibly synchronize the time of the NTP server (ceph 1) to the other nodes:
ntpdate ceph1

3. Write the system time to the hardware clock on all nodes except ceph 1 to prevent configuration failures after a restart.
hwclock -w

4. Install and start the crontab tool on all nodes except ceph 1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e

5. Add the following information so that all nodes except ceph 1 automatically synchronize time with ceph 1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166

----End

Configuring Password-Free Login

Enable ceph 1 and client 1 to access all server and client nodes (including ceph 1 and client 1) without a password.

Step 1 Generate a public key on ceph 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.

● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0

● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.


Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node.
vi /etc/yum.repos.d/ceph.repo

Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache

----End

4.4 Installing Ceph


4.4.1 Installing the Ceph Software

NOTE

When you use yum install to install Ceph, the latest version will be installed by default. In this document, the latest version is Ceph V14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, you can perform the following operations to install Ceph V14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. Then, when you run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.
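For reference, a minimal sketch of the resulting /etc/yum.conf is shown below. Only the exclude line is the addition described in the note; keep the existing entries of your [main] section unchanged.

[main]
# ... existing entries of /etc/yum.conf remain unchanged ...
exclude=*14.2.11*    # hide 14.2.11 packages so that yum resolves Ceph 14.2.10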

Step 1 Install Ceph on each server node and client node.
yum -y install ceph

Step 2 Install ceph-deploy on ceph 1.
yum -y install ceph-deploy

Step 3 Run the following command on each node to check the version:
ceph -v

The command output is similar to the following:
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End


4.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster.
cd /etc/ceph
ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph.
vi /etc/ceph/ceph.conf

Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE

● Run the node configuration commands and use ceph-deploy to configure the OSDs in the /etc/ceph directory. Otherwise, an error may occur.
● This modification isolates the internal cluster network from the external access network. The 192.168.4.0 network is used for data synchronization within the storage cluster (between storage nodes only), and the 192.168.3.0 network is used for data exchange between storage nodes and compute nodes.

Step 3 Initialize the monitors and collect the keys.
ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node.
ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful.
ceph -s

The configuration is successful if the command output is similar to the following:

  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

4.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes.
ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed.
ceph -s

The MGR nodes are successfully deployed if the command output is similar to the following:

  cluster:
    id:     f6b3c38c-7241-44b3-b433-52e276dd53c6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
    mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

4.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, change /dev/nvme0n1 to the actual drive letters. Table 4-7 lists the partitions of an NVMe drive.

Table 4-7 Partitions of an NVMe drive

NVMe Drive DB Partition WAL Partition Metadata Storage Pool Partition

3.2 TB 12 x 45 GB 12 x 20 GB 3 x 700 GB

Step 1 According to the preceding partition plan, create a partition.sh script on ceph 1, ceph 2, and ceph 3.
vi partition.sh

Step 2 Add the following information:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`


do
    ((b = $(( $j * 3 ))))
    ((a = $(( $b - 3 ))))
    ((c = $(( $b - 2 ))))
    str="%"
    echo $a
    echo $b
    echo $c
    parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
    parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done

parted /dev/nvme0n1 mkpart primary 36% 56%
parted /dev/nvme0n1 mkpart primary 56% 76%
parted /dev/nvme0n1 mkpart primary 76% 96%

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash partition.sh
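Optionally, the resulting layout can be verified before the OSDs are created. The sketch below simply counts the partitions on /dev/nvme0n1; for this plan, 27 partitions are expected (12 WAL, 12 DB, and 3 metadata-pool partitions).

# List the partitions and count them (27 expected for this layout).
lsblk /dev/nvme0n1
lsblk -ln /dev/nvme0n1 | grep -c part

# Show the partition boundaries as reported by parted.
parted /dev/nvme0n1 print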

----End

Deploying OSD Nodes

NOTE

● The drives that were previously used as data drives in a Ceph cluster or as OS drives may have residual partitions. You can run the lsblk command to check the drive partitions. For example, if /dev/sdb has partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy
● In the following script, all the drives are data drives. However, if the data drives are not numbered consecutively, for example, if the OS is installed on /dev/sde, you cannot run the script directly. Otherwise, an error will be reported during the deployment on /dev/sde. Instead, you need to modify the script to ensure that only data drives are operated on and that other drives, such as the OS drive and the SSD used for DB and WAL partitions, are not touched.

Step 1 Create the create_osd.sh script on ceph 1.
cd /etc/ceph/
vi /etc/ceph/create_osd.sh

Step 2 Add the following information:
#!/bin/bash

for node in ceph1 ceph2 ceph3
do
    j=1
    k=2
    for i in {a..l}
    do
        ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
        ((j=${j}+2))
        ((k=${k}+2))
    done

    for j in {25..27}
    do


        ceph-deploy osd create ${node} --data /dev/nvme0n1p$j
    done
done

NOTE

In the ceph-deploy osd create command:
● ${node} specifies the hostname of the node.
● --data specifies the data drive.
● --block-db specifies the DB partition.
● --block-wal specifies the WAL partition.
DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured or NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal. Instead, you only need to specify --data.

Step 3 Run the script.
bash create_osd.sh

Step 4 Check whether all the 45 OSDs are in the up state.
ceph -s

----End

4.5 Verifying Ceph

4.5.1 Deploying RGW Nodes

In this example, create 12 RGW instances for each of the three nodes, and set the gateway ports to 10001 to 10036 and the gateway names to bucket1 to bucket36.

Editing the ceph.conf File

Step 1 Open the ceph.conf file on ceph 1.
vim /etc/ceph/ceph.conf
Add the port configuration of the RGW instances to the file.

[global]
fsid = 4f238985-ad0a-4fc3-944b-da59ea3e65d7
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.163,192.168.3.164,192.168.3.165
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

[client.rgw.bucket1] rgw_frontends = civetweb port=10001 log file = /var/log/ceph/client.rgw.bucket1.log [client.rgw.bucket2] rgw_frontends = civetweb port=10002 log file = /var/log/ceph/client.rgw.bucket2.log [client.rgw.bucket3] rgw_frontends = civetweb port=10003


log file = /var/log/ceph/client.rgw.bucket3.log [client.rgw.bucket4] rgw_frontends = civetweb port=10004 log file = /var/log/ceph/client.rgw.bucket4.log [client.rgw.bucket5] rgw_frontends = civetweb port=10005 log file = /var/log/ceph/client.rgw.bucket5.log [client.rgw.bucket6] rgw_frontends = civetweb port=10006 log file = /var/log/ceph/client.rgw.bucket6.log [client.rgw.bucket7] rgw_frontends = civetweb port=10007 log file = /var/log/ceph/client.rgw.bucket7.log [client.rgw.bucket8] rgw_frontends = civetweb port=10008 log file = /var/log/ceph/client.rgw.bucket8.log [client.rgw.bucket9] rgw_frontends = civetweb port=10009 log file = /var/log/ceph/client.rgw.bucket9.log [client.rgw.bucket10] rgw_frontends = civetweb port=10010 log file = /var/log/ceph/client.rgw.bucket10.log [client.rgw.bucket11] rgw_frontends = civetweb port=10011 log file = /var/log/ceph/client.rgw.bucket11.log [client.rgw.bucket12] rgw_frontends = civetweb port=10012 log file = /var/log/ceph/client.rgw.bucket12.log [client.rgw.bucket13] rgw_frontends = civetweb port=10013 log file = /var/log/ceph/client.rgw.bucket13.log [client.rgw.bucket14] rgw_frontends = civetweb port=10014 log file = /var/log/ceph/client.rgw.bucket14.log [client.rgw.bucket15] rgw_frontends = civetweb port=10015 log file = /var/log/ceph/client.rgw.bucket15.log [client.rgw.bucket16] rgw_frontends = civetweb port=10016 log file = /var/log/ceph/client.rgw.bucket16.log [client.rgw.bucket17] rgw_frontends = civetweb port=10017 log file = /var/log/ceph/client.rgw.bucket17.log [client.rgw.bucket18] rgw_frontends = civetweb port=10018 log file = /var/log/ceph/client.rgw.bucket18.log [client.rgw.bucket19] rgw_frontends = civetweb port=10019 log file = /var/log/ceph/client.rgw.bucket19.log [client.rgw.bucket20] rgw_frontends = civetweb port=10020 log file = /var/log/ceph/client.rgw.bucket20.log [client.rgw.bucket21] rgw_frontends = civetweb port=10021 log file = /var/log/ceph/client.rgw.bucket21.log [client.rgw.bucket22] rgw_frontends = civetweb port=10022 log file = /var/log/ceph/client.rgw.bucket22.log [client.rgw.bucket23] rgw_frontends = civetweb port=10023 log file = /var/log/ceph/client.rgw.bucket23.log [client.rgw.bucket24] rgw_frontends = civetweb port=10024 log file = /var/log/ceph/client.rgw.bucket24.log [client.rgw.bucket25] rgw_frontends = civetweb port=10025 log file = /var/log/ceph/client.rgw.bucket25.log [client.rgw.bucket26] rgw_frontends = civetweb port=10026


log file = /var/log/ceph/client.rgw.bucket26.log [client.rgw.bucket27] rgw_frontends = civetweb port=10027 log file = /var/log/ceph/client.rgw.bucket27.log [client.rgw.bucket28] rgw_frontends = civetweb port=10028 log file = /var/log/ceph/client.rgw.bucket28.log [client.rgw.bucket29] rgw_frontends = civetweb port=10029 log file = /var/log/ceph/client.rgw.bucket29.log [client.rgw.bucket30] rgw_frontends = civetweb port=10030 log file = /var/log/ceph/client.rgw.bucket30.log [client.rgw.bucket31] rgw_frontends = civetweb port=10031 log file = /var/log/ceph/client.rgw.bucket31.log [client.rgw.bucket32] rgw_frontends = civetweb port=10032 log file = /var/log/ceph/client.rgw.bucket32.log [client.rgw.bucket33] rgw_frontends = civetweb port=10033 log file = /var/log/ceph/client.rgw.bucket33.log [client.rgw.bucket34] rgw_frontends = civetweb port=10034 log file = /var/log/ceph/client.rgw.bucket34.log [client.rgw.bucket35] rgw_frontends = civetweb port=10035 log file = /var/log/ceph/client.rgw.bucket35.log [client.rgw.bucket36] rgw_frontends = civetweb port=10036 log file = /var/log/ceph/client.rgw.bucket36.log Step 2 Run the following command on ceph 1 to synchronize configuration files on all cluster nodes: ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3

----End

Creating RGW Instances

Step 1 Install the RGW component on all server nodes.
yum -y install ceph-radosgw
Step 2 Create RGW instances on ceph 1.
for i in {1..12};do ceph-deploy rgw create ceph1:bucket$i;done
for i in {13..24};do ceph-deploy rgw create ceph2:bucket$i;done
for i in {25..36};do ceph-deploy rgw create ceph3:bucket$i;done


Step 3 Check whether the 36 RGW processes are online.
ceph -s
The following information is displayed, indicating that the 36 RGW processes are online:

cluster:
  id: f6b3c38c-7241-44b3-b433-52e276dd53c6
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
  mgr: ceph3(active, since 2d), standbys: ceph2, ceph1
  osd: 108 osds: 108 up (since 25h), 108 in (since 9d)
  rgw: 36 daemons active (bucket1, bucket10, bucket11, bucket12, bucket13, bucket14, bucket15, bucket16, bucket17, bucket18, bucket19, bucket2, bucket20, bucket21, bucket22, bucket23, bucket24, bucket25, bucket26, bucket27, bucket28, bucket29, bucket3, bucket30, bucket31, bucket32, bucket33, bucket34, bucket35, bucket36, bucket4, bucket5, bucket6, bucket7, bucket8, bucket9)

----End

4.5.2 Creating a Storage Pool

Object storage requires multiple storage pools. You can create storage pools with small data volumes, such as the metadata storage pool, on SSDs to improve performance. This section describes how to create a metadata object storage pool on an SSD and a data object storage pool on an HDD. By default, Ceph storage pools use the three-replica mode. Data object storage pools sometimes use the erasure coding (EC) mode instead to save storage space. The following describes how to create a storage pool in replication mode and in EC mode: see Creating a Storage Pool in Replication Mode to use the replication mode, and Creating a Storage Pool in EC Mode to use the EC mode.

Creating a Storage Pool in Replication Mode

Step 1 Run the following command on ceph 1 to check the crush class: ceph osd crush class ls

If the server has OSDs created based on SSDs and HDDs, the two drive types are displayed in the crush class.

[ "hdd", "ssd" ]
Step 2 Run the following command on ceph 1 to create a crush rule for the ssd class and the hdd class respectively:
ceph osd crush rule create-replicated rule-ssd default host ssd
ceph osd crush rule create-replicated rule-hdd default host hdd


Step 3 Check whether the crush rules are successfully created. ceph osd crush rule ls

The crush rules of the current cluster are as follows:

replicated_rule
rule-ssd
rule-hdd

replicated_rule is the default crush rule used by the cluster. If no crush rule is specified, replicated_rule is used by default. replicated_rule is in the three-replica mode. All data in the storage pool is stored on all storage devices (SSDs and HDDs) based on a certain proportion. rule-ssd allows data to be stored only on SSDs, and rule-hdd allows data to be stored only on HDDs.

Step 4 Create a data pool and an index pool on ceph 1.
ceph osd pool create default.rgw.buckets.data 1024 1024
ceph osd pool create default.rgw.buckets.index 256 256
ceph osd pool application enable default.rgw.buckets.data rgw
ceph osd pool application enable default.rgw.buckets.index rgw

NOTE

● The two 1024 values in the storage pool creation command (for example, ceph osd pool create default.rgw.buckets.data 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool placement groups (PGs) in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the EC mode, it is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, there are three servers in the cluster. Each server has 15 OSDs and there are 45 OSDs in total. According to the preceding formula, the PG quantity is 1500. It is recommended that the PG quantity be an integral power of 2. The data volume of default.rgw.buckets.data is much larger than that of the other storage pools, so more PGs are allocated to it: the PG quantity of default.rgw.buckets.data is 1024, and the PG quantity of default.rgw.buckets.index is 128 or 256.
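For reference, the calculation described in the preceding note can be reproduced with simple shell arithmetic. This is only an illustrative sketch; substitute the OSD count and redundancy factor of your own cluster:
# Recommended total PGs = (number of OSDs x 100) / data redundancy factor, then round to a power of 2.
osds=45
redundancy=3
echo $(( osds * 100 / redundancy ))   # prints 1500; the power of 2 chosen in this example is 1024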

Step 5 Run the following commands on ceph 1 to modify the crush rules of all storage pools.
for i in `ceph osd lspools | grep -v data | awk '{print $2}'`; do ceph osd pool set $i crush_rule rule-ssd; done
ceph osd pool set default.rgw.buckets.data crush_rule rule-hdd
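To confirm that the rules were applied, individual pools can be queried, for example (pool names must match those in your cluster):
# The index pool should now report rule-ssd and the data pool rule-hdd.
ceph osd pool get default.rgw.buckets.index crush_rule
ceph osd pool get default.rgw.buckets.data crush_rule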

Step 6 Run the following commands on ceph 1, ceph 2, and ceph 3 to cancel the proxy configuration.
unset http_proxy
unset https_proxy

Step 7 Use cURL or a web browser to access the RGW instances for verification. Ensure that the IP address matches the port number. For example, port 10013 corresponds to the IP address 192.168.3.164. If information similar to the following is displayed, the RGWs are created successfully.
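For example, a request similar to the following can be sent from any node (substitute the IP address and port of one of your RGW instances). An anonymous request to a working RGW typically returns an S3 ListAllMyBucketsResult XML document:
curl http://192.168.3.164:10013
# Typical response (formatting may vary):
# <ListAllMyBucketsResult><Owner><ID>anonymous</ID></Owner><Buckets></Buckets></ListAllMyBucketsResult>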


The gateway service is successfully created.

----End

Creating a Storage Pool in EC Mode

Step 1 Run the following command on ceph 1 to check the crush class: ceph osd crush class ls

If the server has OSDs created based on SSDs and HDDs, the two drive types are displayed in the crush class.

[ "hdd", "ssd" ]

Step 2 Run the following command on ceph 1 to create a crush rule for the ssd class:
ceph osd crush rule create-replicated rule-ssd default host ssd

Step 3 Run the following command on ceph 1 to check whether the crush rules are successfully created. ceph osd crush rule ls

The crush rules of the current cluster are as follows. replicated_rule is the default crush rule used by the cluster. If no crush rule is specified, replicated_rule is used by default. replicated_rule is in the three-replica mode. All data in the storage pool is stored on all storage devices (SSDs and HDDs) based on a certain proportion. rule-ssd allows data to be stored only on SSDs.


replicated_rule
rule-ssd

Step 4 Create an EC profile. ceph osd erasure-code-profile set myprofile k=4 m=2 crush-failure-domain=osd crush-device-class=hdd

NOTE

The EC 4+2 mode is used as an example. The preceding command creates an EC profile named myprofile. k specifies the number of data blocks, m specifies the number of parity blocks, crush-failure-domain specifies the minimum fault domain, and crush-device-class=hdd indicates that an HDD-based crush rule is used. Generally, the minimum fault domain is set to host. However, if the number of hosts is less than k+m, the fault domain must be changed to osd; otherwise, an error may occur because there are not enough hosts. In this example the cluster has only three hosts, which is less than k+m = 6, so crush-failure-domain=osd is used.
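You can display the stored profile to confirm the parameters before creating the pool:
# Shows k, m, crush-failure-domain, and crush-device-class for the profile created above.
ceph osd erasure-code-profile get myprofile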

Step 5 Create a data pool and an index pool on ceph 1.
ceph osd pool create default.rgw.buckets.data 2048 2048 erasure myprofile
ceph osd pool create default.rgw.buckets.index 256 256
ceph osd pool application enable default.rgw.buckets.data rgw
ceph osd pool application enable default.rgw.buckets.index rgw

NOTE

● The ceph osd pool create default.rgw.buckets.data 2048 2048 erasure myprofile command creates a pool in EC mode. For object storage, only default.rgw.buckets.data needs to use the EC mode. The other pools still use the default three-replica mode.
● The two 2048 values in the storage pool creation command correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool PGs in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the EC mode, it is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, there are three servers in the cluster. Each server has 36 OSDs and there are 108 OSDs in total. According to the preceding formula, the PG quantity is 1800. It is recommended that the PG quantity be an integral power of 2. The data volume of default.rgw.buckets.data is much larger than that of the other storage pools, so more PGs are allocated to it: the PG quantity of default.rgw.buckets.data is 2048, and the PG quantity of default.rgw.buckets.index is 128 or 256.

Step 6 Run the following commands on ceph 1 to modify the crush rules of all storage pools except the data pool.
for i in `ceph osd lspools | grep -v data | awk '{print $2}'`; do ceph osd pool set $i crush_rule rule-ssd; done

Step 7 Run the following commands on ceph 1, ceph 2, and ceph 3 to cancel the proxy configuration.
unset http_proxy
unset https_proxy

Step 8 Use cURL or a web browser to access the RGW instances for verification. If the gateway information is returned, the RGWs are created successfully.


The gateway service is successfully created.

----End

4.5.3 Creating an RGW Account

To access Ceph object storage from the client, you need to create an RGW account.

Step 1 Create an RGW account on ceph 1.
radosgw-admin user create --uid="admin" --display-name="admin user"
Step 2 After the account is created, run the following command to query the account information:
radosgw-admin user info --uid=admin
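If you want to verify the account end to end, an S3 client can be pointed at one of the RGW instances. The following sketch assumes that the s3cmd tool has been installed separately (it is not part of this guide) and that the access and secret keys are copied from the radosgw-admin user info output:
# List the buckets of the admin account through the RGW instance on ceph 1 (port 10001).
s3cmd ls --access_key=<access_key> --secret_key=<secret_key> \
  --host=192.168.3.163:10001 --host-bucket=192.168.3.163:10001 --no-ssl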


The Ceph RGW and OSD hybrid deployment is complete.

----End

4.5.4 Enabling RGW Data Compression

After creating an RGW account, you can use it to access the RGW. To enable RGW data compression, create a storage pool for storing compressed data, add a data placement policy, and specify the compression algorithm. This section describes how to enable the RGW data compression function.

Creating a Storage Pool for Data Compression

Run the following commands to create the storage pools for data compression:

ceph osd pool create default.rgw.buckets.data-compress 4096 4096
ceph osd pool create default.rgw.buckets.index-compress 256 256
ceph osd pool create default.rgw.buckets.non-ec-compress 64 64


ceph osd pool application enable default.rgw.buckets.data-compress rgw
ceph osd pool application enable default.rgw.buckets.index-compress rgw
ceph osd pool application enable default.rgw.buckets.non-ec-compress rgw

NOTE

These storage pools are created for RGW data compression. They serve as the data_pool, index_pool, and data_extra_pool referenced by the placement policy created in the following operations. The default.rgw.buckets.data-compress pool can also be created in EC mode. For details, see Creating a Storage Pool in EC Mode.

Adding a Placement Policy

The Ceph object storage cluster has a default placement policy default-placement. You need to create the placement policy compress-placement for RGW data compression.

Step 1 Run the following command on ceph 1 to create the placement policy compress-placement:
radosgw-admin zonegroup placement add --rgw-zonegroup=default --placement-id=compress-placement

Step 2 On ceph 1, run the following command to configure the compress-placement information, including the storage pools and compression algorithm of the placement policy:
radosgw-admin zone placement add --rgw-zone=default --placement-id=compress-placement --index_pool=default.rgw.buckets.index-compress --data_pool=default.rgw.buckets.data-compress --data_extra_pool=default.rgw.buckets.non-ec-compress --compression=zlib


NOTE

"--compression=zlib" indicates that the zlib compression algorithm is used. You can also use Snappy or LZ4.

----End

Enabling the Placement Policy

Step 1 Create the admin-compress user. Run the following command on ceph 1: radosgw-admin user create --uid="admin-compress" --display-name="admin compress user"


Step 2 Export user metadata. The following uses the user.json file as an example. Run the following command on ceph 1: radosgw-admin metadata get user:admin-compress > user.json

Step 3 Open the user.json metadata file. vi user.json

Change the value of default_placement to compress-placement.

Step 4 Run the following command on ceph 1 to import the modified user metadata: radosgw-admin metadata put user:admin-compress < user.json

Step 5 Restart the radosgw process. Run the following command on each storage node: systemctl restart ceph-radosgw.target


Step 6 Run the following command on ceph 1 and check whether the value of default_placement is compress-placement. If so, the placement policy takes effect. radosgw-admin user info --uid="admin-compress"

Buckets created by the admin-compress user will use the compress-placement policy, so their data is compressed.
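After the admin-compress user has written some objects, the effect of compression can be checked from the pool statistics. This is a suggested check only; the exact columns depend on the Ceph release:
# The USED COMPR and UNDER COMPR columns of the compressed data pool show how much data was stored compressed.
rados df
ceph df detail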

----End


5 Ceph Object Storage Deployment Guide (openEuler 20.03)

5.1 Introduction
5.2 Environment Requirements
5.3 Configuring the Deployment Environment
5.4 Installing Ceph
5.5 Verifying Ceph

5.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 5-1 shows the Ceph architecture. This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 5-1 Ceph architecture

Table 5-1 describes the modules in the preceding figure.

Table 5-1 Module functions

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.


Module Function

MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a method that simplifies access to RADOS. Currently, it supports programming languages PHP, Ruby, Java, Python, C, and C++. It provides RADOS, a local interface of the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

5.2 Environment Requirements

Hardware Requirements

Table 5-2 lists the hardware requirements.

Table 5-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives System drives: RAID 1 (2 x 960 GB SATA SSDs) Data drives: JBOD enabled in RAID mode (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 5-3 lists the software requirements.

Table 5-3 Required software versions Software Version

OS openEuler 20.03

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE

● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions. ● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 5-2 shows the physical networking.


Figure 5-2 Physical networking diagram

Table 5-4 lists the cluster deployment plan.

Table 5-4 Cluster deployment
Cluster / Management IP Address / Public Network IP Address / Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 5-5 lists the client deployment plan.

Table 5-5 Client deployment
Client / Management IP Address / Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE

● Management IP address: IP address used for remote SSH machine management and configuration. ● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended. ● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended. ● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition used in the Jewel version is no longer used. Instead, a DB partition (metadata partition) and a WAL partition are used. The two partitions store the back-end metadata and log files generated by BlueStore. The metadata improves the efficiency of the entire storage system, and the logs help maintain system stability.
In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of an Object Storage Daemon (OSD), and the NVMe drive provides the DB and WAL partitions of the 12 OSDs. Generally, a WAL partition is sufficient if its capacity is greater than 10 GB. According to the official Ceph documentation, it is recommended that the size of each DB partition be at least 4% of the capacity of each data drive; the size of each DB partition can be flexibly configured based on the NVMe drive capacity.
In this solution, the NVMe drive is divided into 27 partitions: twelve 20 GB partitions are used as WAL partitions, twelve 45 GB partitions are used as DB partitions, and, to improve object storage performance, three 700 GB partitions are used as the storage pool for storing RGW metadata. Table 5-6 lists the partitions of an NVMe drive.

Table 5-6 Partitions of an NVMe drive
NVMe Drive: 3.2 TB
DB Partition: 12 x 45 GB
WAL Partition: 12 x 20 GB
Metadata Storage Pool Partition: 3 x 700 GB

5.3 Configuring the Deployment Environment

Configuring the EPEL Source

Perform the following operations on each server node and client node to configure the Extra Packages for Enterprise Linux (EPEL) source:
Step 1 Upload the everything image source file corresponding to the OS to the server.
Use the SFTP tool to upload the openEuler-***-everything-aarch64-dvd.iso package to the /root directory on the server.


Step 2 Create a local folder to mount the image. mkdir -p /iso

Step 3 Mount the ISO file to the local directory. mount /root/openEuler-***-everything-aarch64-dvd.iso /iso

Step 4 Create a Yum source for the image. vi /etc/yum.repos.d/openEuler.repo

Add the following information to the file: [Base] name=Base baseurl=file:///iso enabled=1 gpgcheck=0 priority=1

Add an external image source. [arch_fedora_online] name=arch_fedora baseurl=https://mirrors.huaweicloud.com/fedora/releases/30/Everything/aarch64/os/ enabled=1 gpgcheck=0 priority=2

----End

Disabling the Firewall On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld systemctl disable firewalld systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, configure ceph 1 to ceph 3 for server nodes and client 1 to client 3 for client nodes. 1. Configure hostnames for server nodes. Set the hostname of server node 1 to ceph 1: hostnamectl --static set-hostname ceph1

Set hostnames for other server nodes in the same way. 2. Set hostnames for client nodes. Set the hostname of client node 1 to client 1: hostnamectl --static set-hostname client1 Set hostnames for other client nodes in the same way.


Step 2 Modify the domain name resolution file. vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent time differences between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service. 1. Install the NTP service on each server node and client node. yum -y install ntp ntpdate

2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak
3. Create an NTP file on ceph 1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE

restrict 192.168.3.0 mask 255.255.255.0 // ceph 1 network segment and subnet mask
4. Create an NTP file on ceph 2, ceph 3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph 2, ceph 3, and all client nodes function as NTP clients:
server 192.168.3.166
5. Save the settings and exit.
Step 2 Start the NTP service.
1. Start the NTP service on ceph 1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd


2. Run the following command on all nodes except ceph 1 to forcibly synchronize the time of the NTP server (ceph 1) to those nodes:
ntpdate ceph1
3. Write the system time to the hardware clock on all nodes except ceph 1 to prevent configuration loss after a restart.
hwclock -w
4. Install and start the crontab tool on all nodes except ceph 1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e
5. Add the following information so that all nodes except ceph 1 automatically synchronize time with ceph 1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166
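To confirm that the other nodes are actually synchronizing with ceph 1, the NTP peer list can be checked on any node, for example:
# An asterisk (*) in front of the ceph1 entry marks it as the selected time source.
ntpq -p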

----End

Configuring Password-Free Login

Enable ceph 1 and client 1 to access all server and client nodes (including ceph 1 and client 1) without a password.

Step 1 Generate a public key on ceph 1 and send the public key to each server node and client node. ssh-keygen -t rsa for i in {1..3}; do ssh-copy-id ceph$i; done for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client 1 and send the public key to each server node and client node. ssh-keygen -t rsa for i in {1..3}; do ssh-copy-id ceph$i; done for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.

● Temporarily disable SELinux. The configuration becomes invalid after the system restarts. setenforce 0


● Permanently disable SELinux. The configuration takes effect after the system restarts. vi /etc/selinux/config Set SELINUX to disabled.

Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node.
vi /etc/yum.repos.d/ceph.repo
Add the following information to the file:
[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1
Step 2 Update the Yum source.
yum clean all && yum makecache


----End

Changing the umask Value

Change the value of umask to 0022 so that Ceph can be properly installed.

Step 1 Open the bashrc file on all cluster nodes.
vi /etc/bashrc
Change the last line to umask 0022.
Step 2 Make the configuration take effect.
source /etc/bashrc
Step 3 Run the following command to check whether the configuration has taken effect:
umask

----End

5.4 Installing Ceph

5.4.1 Installing the Ceph Software

NOTE

When you use yum install to install Ceph, the latest version is installed by default. In this document, the latest version is Ceph 14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, perform the following operations to install Ceph 14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. When you then run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.

Step 1 Install Ceph on each cluster node and client node.


dnf -y install ceph

Step 2 Install ceph-deploy on ceph 1. pip install ceph-deploy pip install prettytable

Step 3 Add a line of code to the _get_distro function in the /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py file to adapt the software to the openEuler system.
vi /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py
'openeuler': fedora,
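After saving the file, you can confirm that the mapping was added, for example:
# The line added above should appear inside the _get_distro mapping.
grep -n "openeuler" /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py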

Step 4 Run the following command on each node to check the version: ceph -v The command output is similar to the following:


ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End

5.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster. cd /etc/ceph ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph. vi /etc/ceph/ceph.conf

Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE

● Run the command for configuring nodes and use ceph-deploy to configure the OSD in the /etc/ceph directory. Otherwise, an error may occur. ● The modification is to isolate the internal cluster network from the external access network. 192.168.4.0 is used for data synchronization between internal storage clusters (used only between storage nodes), and 192.168.3.0 is used for data exchange between storage nodes and compute nodes.

Step 3 Initialize the monitor and collect the keys. ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node. ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful.
ceph -s
The configuration is successful if the command output is similar to the following:

cluster:
  id: f6b3c38c-7241-44b3-b433-52e276dd53c6
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

5.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes. ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed.
ceph -s
The MGR nodes are successfully deployed if the command output is similar to the following:

cluster:
  id: f6b3c38c-7241-44b3-b433-52e276dd53c6
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
  mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

5.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, change /dev/nvme0n1 to the actual drive letters. Table 5-7 lists the partitions of an NVMe drive.

Table 5-7 Partitions of an NVMe drive

NVMe Drive: 3.2 TB
DB Partition: 12 x 45 GB
WAL Partition: 12 x 20 GB
Metadata Storage Pool Partition: 3 x 700 GB

Step 1 According to the preceding partition plan, create a partition.sh script on ceph 1, ceph 2, and ceph 3.
vi partition.sh
Step 2 Add the following information:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`


do
((b = $(( $j * 3 ))))
((a = $(( $b - 3 ))))
((c = $(( $b - 2 ))))
str="%"
echo $a
echo $b
echo $c
parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done

parted /dev/nvme0n1 mkpart primary 36% 56%
parted /dev/nvme0n1 mkpart primary 56% 76%
parted /dev/nvme0n1 mkpart primary 76% 96%

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash partition.sh

----End

Deploying OSD Nodes

NOTE

● The drives that were used as data drives in previous Ceph clusters or as OS drives may have residual partitions. You can run the lsblk command to check the drive partitions. For example, if /dev/sdb has partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy
● In the following script, all the drives are data drives. If the data drives are not numbered consecutively, for example, if the OS is installed on /dev/sde, you cannot run the script directly; otherwise, an error will be reported during the deployment on /dev/sde. Instead, modify the script so that only data drives are operated on and other drives, such as the OS drive and the SSD used for the DB and WAL partitions, are left untouched.

Step 1 Create the create_osd.sh script on ceph 1.
cd /etc/ceph/
vi /etc/ceph/create_osd.sh

Step 2 Add the following information:
#!/bin/bash

for node in ceph1 ceph2 ceph3
do
j=1
k=2
for i in {a..l}
do
ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
((j=${j}+2))
((k=${k}+2))
done

for j in {25..27}
do


ceph-deploy osd create ${node} --data /dev/nvme0n1p$j
done
done

NOTE

In the ceph-deploy osd create command:
● ${node} specifies the hostname of the node.
● --data specifies the data drive.
● --block-db specifies the DB partition.
● --block-wal specifies the WAL partition.
DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured, or if NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal; specify only --data.
Step 3 Run the script.
bash create_osd.sh
Step 4 Check whether all 45 OSDs are in the up state.
ceph -s

----End

5.5 Verifying Ceph

5.5.1 Deploying RGW Nodes

In this example, create 12 RGW instances for each of the three nodes, and set the gateway ports to 10001 to 10036 and the gateway names to bucket1 to bucket36.

Editing the ceph.conf File

Step 1 Open the ceph.conf file on ceph 1.
vim /etc/ceph/ceph.conf
Add the port configuration of the RGW instances to the file.

[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

[client.rgw.bucket1] rgw_frontends = civetweb port=10001 log file = /var/log/ceph/client.rgw.bucket1.log [client.rgw.bucket2] rgw_frontends = civetweb port=10002 log file = /var/log/ceph/client.rgw.bucket2.log [client.rgw.bucket3] rgw_frontends = civetweb port=10003


log file = /var/log/ceph/client.rgw.bucket3.log [client.rgw.bucket4] rgw_frontends = civetweb port=10004 log file = /var/log/ceph/client.rgw.bucket4.log [client.rgw.bucket5] rgw_frontends = civetweb port=10005 log file = /var/log/ceph/client.rgw.bucket5.log [client.rgw.bucket6] rgw_frontends = civetweb port=10006 log file = /var/log/ceph/client.rgw.bucket6.log [client.rgw.bucket7] rgw_frontends = civetweb port=10007 log file = /var/log/ceph/client.rgw.bucket7.log [client.rgw.bucket8] rgw_frontends = civetweb port=10008 log file = /var/log/ceph/client.rgw.bucket8.log [client.rgw.bucket9] rgw_frontends = civetweb port=10009 log file = /var/log/ceph/client.rgw.bucket9.log [client.rgw.bucket10] rgw_frontends = civetweb port=10010 log file = /var/log/ceph/client.rgw.bucket10.log [client.rgw.bucket11] rgw_frontends = civetweb port=10011 log file = /var/log/ceph/client.rgw.bucket11.log [client.rgw.bucket12] rgw_frontends = civetweb port=10012 log file = /var/log/ceph/client.rgw.bucket12.log [client.rgw.bucket13] rgw_frontends = civetweb port=10013 log file = /var/log/ceph/client.rgw.bucket13.log [client.rgw.bucket14] rgw_frontends = civetweb port=10014 log file = /var/log/ceph/client.rgw.bucket14.log [client.rgw.bucket15] rgw_frontends = civetweb port=10015 log file = /var/log/ceph/client.rgw.bucket15.log [client.rgw.bucket16] rgw_frontends = civetweb port=10016 log file = /var/log/ceph/client.rgw.bucket16.log [client.rgw.bucket17] rgw_frontends = civetweb port=10017 log file = /var/log/ceph/client.rgw.bucket17.log [client.rgw.bucket18] rgw_frontends = civetweb port=10018 log file = /var/log/ceph/client.rgw.bucket18.log [client.rgw.bucket19] rgw_frontends = civetweb port=10019 log file = /var/log/ceph/client.rgw.bucket19.log [client.rgw.bucket20] rgw_frontends = civetweb port=10020 log file = /var/log/ceph/client.rgw.bucket20.log [client.rgw.bucket21] rgw_frontends = civetweb port=10021 log file = /var/log/ceph/client.rgw.bucket21.log [client.rgw.bucket22] rgw_frontends = civetweb port=10022 log file = /var/log/ceph/client.rgw.bucket22.log [client.rgw.bucket23] rgw_frontends = civetweb port=10023 log file = /var/log/ceph/client.rgw.bucket23.log [client.rgw.bucket24] rgw_frontends = civetweb port=10024 log file = /var/log/ceph/client.rgw.bucket24.log [client.rgw.bucket25] rgw_frontends = civetweb port=10025 log file = /var/log/ceph/client.rgw.bucket25.log [client.rgw.bucket26] rgw_frontends = civetweb port=10026


log file = /var/log/ceph/client.rgw.bucket26.log [client.rgw.bucket27] rgw_frontends = civetweb port=10027 log file = /var/log/ceph/client.rgw.bucket27.log [client.rgw.bucket28] rgw_frontends = civetweb port=10028 log file = /var/log/ceph/client.rgw.bucket28.log [client.rgw.bucket29] rgw_frontends = civetweb port=10029 log file = /var/log/ceph/client.rgw.bucket29.log [client.rgw.bucket30] rgw_frontends = civetweb port=10030 log file = /var/log/ceph/client.rgw.bucket30.log [client.rgw.bucket31] rgw_frontends = civetweb port=10031 log file = /var/log/ceph/client.rgw.bucket31.log [client.rgw.bucket32] rgw_frontends = civetweb port=10032 log file = /var/log/ceph/client.rgw.bucket32.log [client.rgw.bucket33] rgw_frontends = civetweb port=10033 log file = /var/log/ceph/client.rgw.bucket33.log [client.rgw.bucket34] rgw_frontends = civetweb port=10034 log file = /var/log/ceph/client.rgw.bucket34.log [client.rgw.bucket35] rgw_frontends = civetweb port=10035 log file = /var/log/ceph/client.rgw.bucket35.log [client.rgw.bucket36] rgw_frontends = civetweb port=10036 log file = /var/log/ceph/client.rgw.bucket36.log Step 2 Run the following command on ceph 1 to synchronize configuration files on all cluster nodes: ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3

----End

Creating RGW Instances

Step 1 Install the RGW component on all server nodes.
yum -y install ceph-radosgw
Step 2 Create RGW instances on ceph 1.
for i in {1..12};do ceph-deploy rgw create ceph1:bucket$i;done
for i in {13..24};do ceph-deploy rgw create ceph2:bucket$i;done
for i in {25..36};do ceph-deploy rgw create ceph3:bucket$i;done


Step 3 Check whether the 36 RGW processes are online.
ceph -s
The following information is displayed, indicating that the 36 RGW processes are online:

cluster:
  id: f6b3c38c-7241-44b3-b433-52e276dd53c6
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)
  mgr: ceph3(active, since 2d), standbys: ceph2, ceph1
  osd: 108 osds: 108 up (since 25h), 108 in (since 9d)
  rgw: 36 daemons active (bucket1, bucket10, bucket11, bucket12, bucket13, bucket14, bucket15, bucket16, bucket17, bucket18, bucket19, bucket2, bucket20, bucket21, bucket22, bucket23, bucket24, bucket25, bucket26, bucket27, bucket28, bucket29, bucket3, bucket30, bucket31, bucket32, bucket33, bucket34, bucket35, bucket36, bucket4, bucket5, bucket6, bucket7, bucket8, bucket9)

----End

5.5.2 Creating a Storage Pool

Object storage requires multiple storage pools. You can create storage pools with small data volumes, such as the metadata storage pool, on SSDs to improve performance. This section describes how to create a metadata object storage pool on an SSD and a data object storage pool on an HDD. By default, Ceph storage pools use the three-replica mode. Data object storage pools sometimes use the erasure coding (EC) mode instead to save storage space. The following describes how to create a storage pool in replication mode and in EC mode: see Creating a Storage Pool in Replication Mode to use the replication mode, and Creating a Storage Pool in EC Mode to use the EC mode.
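The space saving mentioned above can be estimated with simple arithmetic. The figures below are illustrative only and assume the raw capacity of the example cluster (3 nodes x 12 x 4 TB HDDs):
# Usable capacity is raw/3 with three replicas and raw x k/(k+m) with EC.
raw_tb=144
echo "3-replica usable: $(( raw_tb / 3 )) TB"      # 48 TB
echo "EC 4+2 usable: $(( raw_tb * 4 / 6 )) TB"     # 96 TB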

Creating a Storage Pool in Replication Mode

Step 1 Run the following command on ceph 1 to check the crush class: ceph osd crush class ls

If the server has OSDs created based on SSDs and HDDs, the two drive types are displayed in the crush class.

[ "hdd", "ssd" ]
Step 2 Run the following command on ceph 1 to create a crush rule for the ssd class and the hdd class respectively:
ceph osd crush rule create-replicated rule-ssd default host ssd
ceph osd crush rule create-replicated rule-hdd default host hdd


Step 3 Check whether the crush rules are successfully created. ceph osd crush rule ls

The crush rules of the current cluster are as follows:

replicated_rule
rule-ssd
rule-hdd

replicated_rule is the default crush rule used by the cluster. If no crush rule is specified, replicated_rule is used by default. replicated_rule is in the three-replica mode. All data in the storage pool is stored on all storage devices (SSDs and HDDs) based on a certain proportion. rule-ssd allows data to be stored only on SSDs, and rule-hdd allows data to be stored only on HDDs.

Step 4 Create a data pool and an index pool on ceph 1.
ceph osd pool create default.rgw.buckets.data 1024 1024
ceph osd pool create default.rgw.buckets.index 256 256
ceph osd pool application enable default.rgw.buckets.data rgw
ceph osd pool application enable default.rgw.buckets.index rgw

NOTE

● The two 1024 values in the storage pool creation command (for example, ceph osd pool create default.rgw.buckets.data 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool placement groups (PGs) in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the EC mode, it is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, there are three servers in the cluster. Each server has 15 OSDs and there are 45 OSDs in total. According to the preceding formula, the PG quantity is 1500. It is recommended that the PG quantity be an integral power of 2. The data volume of default.rgw.buckets.data is much larger than that of the other storage pools, so more PGs are allocated to it: the PG quantity of default.rgw.buckets.data is 1024, and the PG quantity of default.rgw.buckets.index is 128 or 256.

Step 5 Run the following commands on ceph 1 to modify the crush rules of all storage pools.
for i in `ceph osd lspools | grep -v data | awk '{print $2}'`; do ceph osd pool set $i crush_rule rule-ssd; done
ceph osd pool set default.rgw.buckets.data crush_rule rule-hdd

Step 6 Run the following commands on ceph 1, ceph 2, and ceph 3 to cancel the proxy configuration.
unset http_proxy
unset https_proxy

Step 7 Use cURL or a web browser to access the RGW instances for verification. Ensure that the IP address matches the port number. For example, port 10013 corresponds to the IP address 192.168.3.167. If information similar to the following is displayed, the RGWs are created successfully.


The gateway service is successfully created.

----End

Creating a Storage Pool in EC Mode

Step 1 Run the following command on ceph 1 to check the crush class: ceph osd crush class ls

If the server has OSDs created based on SSDs and HDDs, the two drive types are displayed in the crush class.

[ "hdd", "ssd" ]

Step 2 Run the following command on ceph 1 to create a crush rule for the ssd class:
ceph osd crush rule create-replicated rule-ssd default host ssd

Step 3 Run the following command on ceph 1 to check whether the crush rules are successfully created. ceph osd crush rule ls

The crush rules of the current cluster are as follows. replicated_rule is the default crush rule used by the cluster. If no crush rule is specified, replicated_rule is used by default. replicated_rule is in the three-replica mode. All data in the storage pool is stored on all storage devices (SSDs and HDDs) based on a certain proportion. rule-ssd allows data to be stored only on SSDs.


replicated_rule
rule-ssd

Step 4 Create an EC profile. ceph osd erasure-code-profile set myprofile k=4 m=2 crush-failure-domain=osd crush-device-class=hdd

NOTE

The EC 4+2 mode is used as an example. The preceding command creates an EC profile named myprofile. k specifies the number of data blocks, m specifies the number of parity blocks, crush-failure-domain specifies the minimum fault domain, and crush-device-class=hdd indicates that an HDD-based crush rule is used. Generally, the minimum fault domain is set to host. However, if the number of hosts is less than k+m, the fault domain must be changed to osd; otherwise, an error may occur because there are not enough hosts. In this example the cluster has only three hosts, which is less than k+m = 6, so crush-failure-domain=osd is used.

Step 5 Create a data pool and an index pool on ceph 1.
ceph osd pool create default.rgw.buckets.data 2048 2048 erasure myprofile
ceph osd pool create default.rgw.buckets.index 256 256
ceph osd pool application enable default.rgw.buckets.data rgw
ceph osd pool application enable default.rgw.buckets.index rgw

NOTE

● The ceph osd pool create default.rgw.buckets.data 2048 2048 erasure myprofile command creates a pool in EC mode. For object storage, only default.rgw.buckets.data needs to use the EC mode. The other pools still use the default three-replica mode.
● The two 2048 values in the storage pool creation command correspond to the pg_num and pgp_num parameters of the storage pool. According to the official Ceph documentation, the recommended total number of storage pool PGs in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the EC mode, it is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, there are three servers in the cluster. Each server has 36 OSDs and there are 108 OSDs in total. According to the preceding formula, the PG quantity is 1800. It is recommended that the PG quantity be an integral power of 2. The data volume of default.rgw.buckets.data is much larger than that of the other storage pools, so more PGs are allocated to it: the PG quantity of default.rgw.buckets.data is 2048, and the PG quantity of default.rgw.buckets.index is 128 or 256.

Step 6 Run the following commands on ceph 1 to modify the crush rules of all storage pools except the data pool.
for i in `ceph osd lspools | grep -v data | awk '{print $2}'`; do ceph osd pool set $i crush_rule rule-ssd; done

Step 7 Run the following commands on ceph 1, ceph 2, and ceph 3 to cancel the proxy configuration.
unset http_proxy
unset https_proxy

Step 8 Use cURL or a web browser to access the gateway on each node and verify that the RGW service responds.
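For example, assuming the gateways listen on the default RGW port 7480 (adjust the address and port to match the configuration used when the RGW was deployed), a quick check from any node could look like the following:
curl http://192.168.3.166:7480
A working gateway returns a short XML response (ListAllMyBucketsResult) for the anonymous request.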


The gateway service is successfully created.

----End

5.5.3 Creating an RGW Account

To access Ceph object storage from the client, you need to create an RGW account.

Step 1 Create an RGW account on ceph 1.
radosgw-admin user create --uid="admin" --display-name="admin user"

Step 2 After the account is created, run the following command to query the account information:
radosgw-admin user info --uid=admin
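The output of the query contains the S3 credentials of the account. If you only need the keys, for example to configure an S3 client, you can optionally filter them out of the output; this is a convenience step that is not part of the original procedure:
radosgw-admin user info --uid=admin | grep -E '"access_key"|"secret_key"'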


The Ceph RGW OSD hybrid deployment is complete.

----End

5.5.4 Enabling RGW Data Compression

After creating an RGW account, you can use it to access the RGW. To enable RGW data compression, create a storage pool for storing compressed data, add a data placement policy, and specify the compression algorithm. This section describes how to enable the RGW data compression function.

Creating a Storage Pool for Data Compression

Run the following commands to create a storage pool for data compression:

ceph osd pool create default.rgw.buckets.data-compress 4096 4096
ceph osd pool create default.rgw.buckets.index-compress 256 256
ceph osd pool create default.rgw.buckets.non-ec-compress 64 64


ceph osd pool application enable default.rgw.buckets.data-compress rgw
ceph osd pool application enable default.rgw.buckets.index-compress rgw
ceph osd pool application enable default.rgw.buckets.non-ec-compress rgw

NOTE

These storage pools are created for enabling RGW data compression. They serve as the data_pool, index_pool, and data_extra_pool used in the placement policy created in the following operations. The default.rgw.buckets.data-compress pool can also be created in EC mode. For details, see Creating a Storage Pool in EC Mode.

Adding a Placement Policy

The Ceph object storage cluster has a default placement policy default-placement. You need to create the placement policy compress-placement for RGW data compression.

Step 1 Run the following command on ceph 1 to create the placement policy compress-placement:
radosgw-admin zonegroup placement add --rgw-zonegroup=default --placement-id=compress-placement

Step 2 On ceph 1, run the following command to enter the compress-placement information, including the storage pools and compression algorithm of the placement policy:
radosgw-admin zone placement add --rgw-zone=default --placement-id=compress-placement --index_pool=default.rgw.buckets.index-compress --data_pool=default.rgw.buckets.data-compress --data_extra_pool=default.rgw.buckets.non-ec-compress --compression=zlib


NOTE

"--compression=zlib" indicates that the zlib compression algorithm is used. You can also use Snappy or LZ4.

----End

Enabling the Placement Policy

Step 1 Create the admin-compress user. Run the following command on ceph 1: radosgw-admin user create --uid="admin-compress" --display-name="admin compress user"


Step 2 Export user metadata. The following uses the user.json file as an example. Run the following command on ceph 1: radosgw-admin metadata get user:admin-compress > user.json

Step 3 Open the user.json metadata file. vi user.json

Change the value of default_placement to compress-placement.
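For reference, the relevant part of user.json looks similar to the following fragment. The surrounding fields and their order vary; only the default_placement value needs to change, and the fragment below is illustrative rather than a complete file:
"default_placement": "compress-placement",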

Step 4 Run the following command on ceph 1 to import the modified user metadata: radosgw-admin metadata put user:admin-compress < user.json

Step 5 Restart the radosgw process. Run the following command on each storage node: systemctl restart ceph-radosgw.target


Step 6 Run the following command on ceph 1 and check whether the value of default_placement is compress-placement. If so, the placement policy takes effect. radosgw-admin user info --uid="admin-compress"

The bucket created by the admin-compress user will use the compress-placement policy to complete the compression.

----End


6 Ceph File Storage Deployment Guide (CentOS 7.6)

6.1 Introduction
6.2 Environment Requirements
6.3 Configuring the Deployment Environment
6.4 Installing Ceph
6.5 Verifying Ceph

6.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 6-1 shows the Ceph architecture.
This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 6-1 Ceph architecture

Table 6-1 describes the modules in the preceding figure.

Table 6-1 Module functions Module Function

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.



MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a method that simplifies access to RADOS. Currently, it supports programming languages PHP, Ruby, Java, Python, C, and C++. It provides RADOS, a local interface of the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

6.2 Environment Requirements

Hardware Requirements

Table 6-2 lists the hardware requirements.

Table 6-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives System drives: RAID 1 (2 x 960 GB SATA SSDs) Data drives: JBOD enabled in RAID mode (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 6-3 lists the software requirements.

Table 6-3 Required software versions Software Version

OS CentOS Linux release 7.6.1810

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE

● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions.
● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 6-2 shows the physical networking.


Figure 6-2 Physical networking diagram

Table 6-4 lists the cluster deployment plan.

Table 6-4 Cluster deployment
Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 6-5 lists the client deployment plan.

Table 6-5 Client deployment
Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE

● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition used in the Jewel version is no longer needed. Instead, a DB partition (metadata partition) and a WAL partition are used, which store the metadata and write-ahead log generated by the BlueStore back end, respectively.
In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of one OSD, and the NVMe drive provides the DB and WAL partitions for the 12 OSDs. Generally, a WAL partition is sufficient if its capacity is greater than 10 GB. According to the official Ceph documentation, it is recommended that the size of each DB partition be at least 4% of the capacity of each data drive. The size of each DB partition can be flexibly configured based on the NVMe drive capacity. In this example, the WAL partition capacity is 60 GB and the DB partition capacity is 180 GB. Table 6-6 lists the partitions of one OSD.

Table 6-6 Drive partitions Data Drive DB Partition WAL Partition

4 TB 180 GB 60 GB

6.3 Configuring the Deployment Environment

Configuring the EPEL Source

On each server node and client node, run the following command to configure the Extra Packages for Enterprise Linux (EPEL) source:

yum install epel-release -y


Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, configure ceph 1 to ceph 3 for server nodes and client 1 to client 3 for client nodes.
1. Configure hostnames for server nodes. Set the hostname of server node 1 to ceph 1:
hostnamectl --static set-hostname ceph1
Set hostnames for other server nodes in the same way.
2. Set hostnames for client nodes. Set the hostname of client node 1 to client 1:
hostnamectl --static set-hostname client1
Set hostnames for other client nodes in the same way.

Step 2 Modify the domain name resolution file. vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent the time difference between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate


2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak
3. Create an NTP file on ceph 1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE

restrict 192.168.3.0 mask 255.255.255.0 // ceph 1 network segment and subnet mask
4. Create an NTP file on ceph 2, ceph 3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph 2, ceph 3, and all client nodes function as NTP clients:
server 192.168.3.166
5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph 1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd

2. Run the following command on all nodes except ceph 1 to forcibly synchronize the NTP server (ceph 1) time to all the other nodes:
ntpdate ceph1
3. Write the hardware clock to all nodes except ceph 1 to prevent configuration failures after the restart.
hwclock -w
4. Install and start the crontab tool on all nodes except ceph 1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e
5. Add the following information so that all nodes except ceph 1 can automatically synchronize time with ceph 1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166
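After the cron job is configured, you can optionally check the synchronization status on any node. This verification step is not part of the original procedure and uses the standard ntpq tool installed with the ntp package:
ntpq -p
The peer list should show ceph1 (192.168.3.166) as the time source.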

----End

Configuring Password-Free Login

Enable ceph 1 and client 1 to access all server and client nodes (including ceph 1 and client 1) without a password.

Step 1 Generate a public key on ceph 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.
● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0

● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.
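You can optionally verify the SELinux state after making the changes (a standard command, added here for convenience and not part of the original procedure):
getenforce
The command returns Permissive after setenforce 0, and Disabled after the configuration file change takes effect following a reboot.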


Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node. vi /etc/yum.repos.d/ceph.repo Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache

----End

6.4 Installing Ceph


6.4.1 Installing the Ceph Software

NOTE

When you use yum install to install Ceph, the latest version will be installed by default. In this document, the latest version is Ceph V14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, you can perform the following operations to install Ceph V14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. Then, when you run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.

Step 1 Install Ceph on each server node and client node. yum -y install ceph

Step 2 Install ceph-deploy on ceph 1. yum -y install ceph-deploy

Step 3 Run the following command on each node to check the version:
ceph -v
The command output is similar to the following:
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End


6.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster.
cd /etc/ceph
ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph.
vi /etc/ceph/ceph.conf
Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE

● Run the command for configuring nodes and use ceph-deploy to configure the OSD in the /etc/ceph directory. Otherwise, an error may occur.
● The modification is to isolate the internal cluster network from the external access network. 192.168.4.0 is used for data synchronization between internal storage clusters (used only between storage nodes), and 192.168.3.0 is used for data exchange between storage nodes and compute nodes.

Step 3 Initialize the monitor and collect the keys.
ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node. ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful. ceph -s The configuration is successful if the command output is similar to the following:

cluster: id: f6b3c38c-7241-44b3-b433-52e276dd53c6 health: HEALTH_OK

services: mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

6.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes. ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed. ceph -s

The MGR nodes are successfully deployed if the command output is similar to the following: cluster: id: f6b3c38c-7241-44b3-b433-52e276dd53c6 health: HEALTH_OK

services: mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h) mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

6.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, you only need to change /dev/nvme0n1 to the actual drive letters.

The NVMe SSD is divided into twelve 60 GB partitions and twelve 180 GB partitions, which correspond to the WAL and DB partitions respectively.

Step 1 Create the partition.sh script. vi partition.sh

Step 2 Add the following information to the script:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`
do
((b = $(( $j * 8 ))))
((a = $(( $b - 8 ))))
((c = $(( $b - 6 ))))
str="%"
echo $a
echo $b
echo $c
parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash partition.sh
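You can optionally confirm that the 24 partitions were created as expected. This check is not part of the original procedure:
lsblk /dev/nvme0n1
The output should list nvme0n1p1 through nvme0n1p24, alternating between the roughly 60 GB WAL partitions and the roughly 180 GB DB partitions.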

----End

Deploying OSD Nodes

NOTE

In the following script, the 12 drives /dev/sda to /dev/sdl are data drives, and the OS is installed on /dev/sdm. However, if the data drives are not numbered consecutively, for example, the OS is installed on /dev/sde, you cannot run the script directly. Otherwise, an error will be reported during the deployment on /dev/sde. Instead, you need to modify the script to ensure that only data drives are operated and other drives such as the OS drive and SSD drive for DB and WAL partitions are not operated.

Step 1 Run the following command to check the drive letter of each drive on each node. lsblk

Use the command output to identify the OS drive and confirm which drives are data drives.

NOTE

The drives that were ever used as OS drives or data drives in a Ceph cluster may have residual partitions. You can run the lsblk command to check for the drive partitions. For example, if /dev/sdb has residual partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy

CAUTION

You must determine the data drives first, and then run the destroy command only when the data drives have residual partitions.

Step 2 Create the create_osd.sh script on ceph 1 and deploy the OSD node on the 12 drives on each server. cd /etc/ceph/ vi /etc/ceph/create_osd.sh Add the following information to the script:

#!/bin/bash

for node in ceph1 ceph2 ceph3
do
j=1
k=2
for i in {a..l}
do
ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
((j=${j}+2))
((k=${k}+2))
sleep 3
done
done

NOTE

● This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.
● In the ceph-deploy osd create command:
– ${node} specifies the hostname of the node.
– --data specifies the data drive.
– --block-db specifies the DB partition.
– --block-wal specifies the WAL partition.
DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured or NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal. You only need to specify --data.

Step 3 Run the script on ceph 1. bash create_osd.sh

Step 4 Check whether all 36 OSD nodes are in the up state. ceph -s
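In addition to ceph -s, you can optionally view how the OSDs are distributed across the hosts (a standard command, added here as a convenience):
ceph osd tree
All 36 OSDs should be listed under their hosts with the status up.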

----End

6.5 Verifying Ceph

6.5.1 Configuring MDS Nodes

The Metadata Server (MDS) manages files and directories in the CephFS cluster. Perform the following steps to configure MDS nodes:

Step 1 Create MDS nodes. Run the following commands on ceph 1: cd /etc/ceph ceph-deploy mds create ceph1 ceph2 ceph3


Step 2 Check whether the MDS process is successfully created on each Ceph node. ps -ef | grep ceph-mds | grep -v grep

The MDS process starts successfully if the command output is similar to the following:

ceph 64149 1 0 Nov15 ? 00:01:18 /usr/bin/ceph-mds -f --cluster ceph --id ceph4 --setuser ceph --setgroup ceph

----End

6.5.2 Creating Storage Pools and a File System

NOTE

● CephFS requires two pools to store data and metadata respectively. The following describes how to create the fs_data and fs_metadata pools.
● The two 1024 numbers in the storage pool creation command (for example, ceph osd pool create fs_data 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pools. According to the official Ceph documentation, the recommended total number of storage pool placement groups (PGs) in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the erasure code (EC) mode, the data redundancy factor is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, the cluster is composed of three servers. Each server has 12 OSDs, and there are 36 OSDs in total. According to the preceding formula, the PG quantity is 1200. It is recommended that the PG quantity be an integral power of 2. The data volume of fs_data is much larger than that of the other storage pools and therefore more PGs are allocated to this storage pool. The PG quantity of fs_data is 1024, and the PG quantity of fs_metadata is 128 or 256.

Step 1 Run the following commands to create storage pools on ceph 1:
ceph osd pool create fs_data 1024 1024
ceph osd pool create fs_metadata 128 128

NOTE

In the preceding command, fs_data is the storage pool name, and the two 1024 numbers are the PG quantity and placement group for placement purpose (PGP) quantity. It is the same for fs_metadata.

Step 2 Create a file system based on the storage pools. ceph fs new cephfs fs_metadata fs_data

NOTE

cephfs is the file system name, and fs_metadata and fs_data are storage pool names.


Step 3 Enable zlib compression for the storage pools.
ceph osd pool set fs_data compression_algorithm zlib
ceph osd pool set fs_data compression_mode force
ceph osd pool set fs_data compression_required_ratio .99

NOTE

This step enables OSD compression. Skip this step if OSD compression is not required.

Step 4 View the created CephFS.
ceph fs ls
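You can also optionally confirm the compression settings applied in Step 3. This check is not part of the original procedure, and it only returns values for options that have been explicitly set on the pool:
ceph osd pool get fs_data compression_algorithm
ceph osd pool get fs_data compression_mode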

----End

6.5.3 Mounting the File System to the Clients

Step 1 Log in to any client node and check the key used for the client node to access the Ceph cluster. cat /etc/ceph/ceph.client.admin.keyring

NOTE

You only need to run the command once. The keys for server nodes and client nodes are the same.

Step 2 Create a file system mount point on each client node.
mkdir /mnt/cephfs

Step 3 Run the following command on each client node:
mount -t ceph 192.168.3.166:6789,192.168.3.167:6789,192.168.3.168:6789:/ /mnt/cephfs -o name=admin,secret=Key obtained in step 1,sync

NOTE

The default MON port number is 6789. The -o parameter specifies the user name and password for logging in to the cluster.
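If you prefer not to put the key on the command line, mount.ceph also accepts a secret file. The following is a sketch that assumes the key string from ceph.client.admin.keyring has been saved to /etc/ceph/admin.secret (a file name chosen for this example, not defined elsewhere in this guide):
mount -t ceph 192.168.3.166:6789,192.168.3.167:6789,192.168.3.168:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,sync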


Step 4 Check whether the file system is successfully mounted to each client node and whether the file system type is ceph. stat -f /mnt/cephfs

----End


7 Ceph File Storage Deployment Guide (openEuler 20.03)

7.1 Introduction
7.2 Environment Requirements
7.3 Configuring the Deployment Environment
7.4 Installing Ceph
7.5 Verifying Ceph

7.1 Introduction

Overview

Ceph is a distributed, scalable, reliable, and high-performance storage system platform that supports storage interfaces including block devices, file systems, and object gateways. Figure 7-1 shows the Ceph architecture.
This document describes how to deploy Ceph. Before installing Ceph, disable the firewall, configure the hostname, configure the time service, configure password-free login, disable SELinux, and configure the software sources. Then, run yum commands to install Ceph and deploy the Monitor (MON), Manager (MGR), and Object Storage Daemon (OSD) nodes. Finally, verify Ceph to complete the deployment.


Figure 7-1 Ceph architecture

Table 7-1 describes the modules in the preceding figure.

Table 7-1 Module functions Module Function

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of a cluster can be synchronized at the same time. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment, and they must handle the collaboration between them.



MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a method that simplifies access to RADOS. Currently, it supports programming languages PHP, Ruby, Java, Python, C, and C++. It provides RADOS, a local interface of the Ceph storage cluster, and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to directly access RADOS, enabling developers to create their own interfaces for accessing the Ceph cluster storage.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata used only for CephFS. The RBD and RGW do not require metadata. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.10

7.2 Environment Requirements

Hardware Requirements

Table 7-2 lists the hardware requirements.

Table 7-2 Hardware requirements

Server TaiShan 200 server (model 2280)

Processor Kunpeng 920 5230 processor


Cores 2 x 32-core

CPU frequency 2600 MHz

Memory capacity 8 x 16 GB

Memory frequency 2933 MHz

NIC Standard Ethernet card 25GE (Hi1822) four-port SFP+

Drives System drives: RAID 1 (2 x 960 GB SATA SSDs) Data drives: JBOD enabled in RAID mode (12 x 4 TB SATA HDDs)

NVMe SSD 1 x ES3000 V5 3.2 TB NVMe SSD

RAID controller card LSI SAS3508

NOTE

The installation of Ceph and its dependencies requires Internet access. Ensure that the server is connected to the Internet.

Software Requirements

Table 7-3 lists the software requirements.

Table 7-3 Required software versions Software Version

OS openEuler 20.03

Ceph 14.2.10 Nautilus

ceph-deploy 2.0.1

NOTE

● This document uses Ceph 14.2.10 as an example. You can also refer to this document to install other versions.
● If Ceph is installed on the OS for the first time, you are not advised to use minimum installation. Otherwise, you may need to manually install many software packages. You can select the Server with GUI installation mode.

Cluster Environment Planning

Figure 7-2 shows the physical networking.


Figure 7-2 Physical networking diagram

Table 7-4 lists the cluster deployment plan.

Table 7-4 Cluster deployment
Cluster    Management IP Address    Public Network IP Address    Cluster Network IP Address

ceph1 192.168.2.166 192.168.3.166 192.168.4.166

ceph2 192.168.2.167 192.168.3.167 192.168.4.167

ceph3 192.168.2.168 192.168.3.168 192.168.4.168

Table 7-5 lists the client deployment plan.

Table 7-5 Client deployment
Client    Management IP Address    Service Port IP Address

client1 192.168.2.160 192.168.3.160

client2 192.168.2.161 192.168.3.161

client3 192.168.2.162 192.168.3.162


NOTE

● Management IP address: IP address used for remote SSH machine management and configuration.
● Cluster network IP address: IP address used for data synchronization between clusters. The 25GE network port is recommended.
● Public network IP address: IP address of the storage node for other nodes to access. The 25GE network port is recommended.
● Ensure that the service port IP addresses of clients and the public network IP address of the cluster are in the same network segment. The 25GE network port is recommended.

Drive Partitioning

Ceph 14.2.10 uses BlueStore as the back-end storage engine. The Journal partition used in the Jewel version is no longer needed. Instead, a DB partition (metadata partition) and a WAL partition are used, which store the metadata and write-ahead log generated by the BlueStore back end, respectively.
In cluster deployment mode, each Ceph node is configured with twelve 4 TB data drives and one 3.2 TB NVMe drive. Each 4 TB data drive functions as the data partition of one OSD, and the NVMe drive provides the DB and WAL partitions for the 12 OSDs. Generally, a WAL partition is sufficient if its capacity is greater than 10 GB. According to the official Ceph documentation, it is recommended that the size of each DB partition be at least 4% of the capacity of each data drive. The size of each DB partition can be flexibly configured based on the NVMe drive capacity. In this example, the WAL partition capacity is 60 GB and the DB partition capacity is 180 GB. Table 7-6 lists the partitions of one OSD.

Table 7-6 Drive partitions Data Drive DB Partition WAL Partition

4 TB 180 GB 60 GB

7.3 Configuring the Deployment Environment

Configuring the EPEL Source

Perform the following operations on each server node and client node to configure the Extra Packages for Enterprise Linux (EPEL) source:

Step 1 Upload the everything image source file corresponding to the OS to the server. Use the SFTP tool to upload the openEuler-***-everything-aarch64-dvd.iso package to the /root directory on the server.

Step 2 Create a local folder to mount the image.
mkdir -p /iso

Step 3 Mount the ISO file to the local directory.
mount /root/openEuler-***-everything-aarch64-dvd.iso /iso


Step 4 Create a Yum source for the image.
vi /etc/yum.repos.d/openEuler.repo

Add the following information to the file:
[Base]
name=Base
baseurl=file:///iso
enabled=1
gpgcheck=0
priority=1

Add an external image source.
[arch_fedora_online]
name=arch_fedora
baseurl=https://mirrors.huaweicloud.com/fedora/releases/30/Everything/aarch64/os/
enabled=1
gpgcheck=0
priority=2

----End

Disabling the Firewall

On each server node and client node, run the following commands in sequence to disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Configuring Hostnames

Step 1 Configure static hostnames, for example, configure ceph 1 to ceph 3 for server nodes and client 1 to client 3 for client nodes.
1. Configure hostnames for server nodes. Set the hostname of server node 1 to ceph 1:
hostnamectl --static set-hostname ceph1
Set hostnames for other server nodes in the same way.
2. Set hostnames for client nodes. Set the hostname of client node 1 to client 1:
hostnamectl --static set-hostname client1
Set hostnames for other client nodes in the same way.

Step 2 Modify the domain name resolution file.
vi /etc/hosts

On each server node and client node, add the following information to the /etc/hosts file:
192.168.3.166 ceph1
192.168.3.167 ceph2
192.168.3.168 ceph3
192.168.3.160 client1
192.168.3.161 client2
192.168.3.162 client3

----End

Configuring NTP

Ceph automatically checks the time of storage nodes. If a large time difference is detected, an alarm will be generated. To prevent the time difference between storage nodes, perform the following steps:

Step 1 Install and configure the Network Time Protocol (NTP) service.
1. Install the NTP service on each server node and client node.
yum -y install ntp ntpdate

2. Back up the existing configuration on each server node and client node.
cd /etc && mv ntp.conf ntp.conf.bak
3. Create an NTP file on ceph 1, which serves as the NTP server.
vi /etc/ntp.conf
Add the following NTP server configuration to the NTP file:
restrict 127.0.0.1
restrict ::1
restrict 192.168.3.0 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

NOTE

restrict 192.168.3.0 mask 255.255.255.0 // ceph 1 network segment and subnet mask
4. Create an NTP file on ceph 2, ceph 3, and all client nodes.
vi /etc/ntp.conf
Add the following content to the NTP files so that ceph 2, ceph 3, and all client nodes function as NTP clients:
server 192.168.3.166
5. Save the settings and exit.

Step 2 Start the NTP service.
1. Start the NTP service on ceph 1 and check the service status.
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd


2. Run the following command on all nodes except ceph 1 to forcibly synchronize the NTP server (ceph 1) time to all the other nodes:
ntpdate ceph1
3. Write the hardware clock to all nodes except ceph 1 to prevent configuration failures after the restart.
hwclock -w
4. Install and start the crontab tool on all nodes except ceph 1.
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e
5. Add the following information so that all nodes except ceph 1 can automatically synchronize time with ceph 1 every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.3.166

----End

Configuring Password-Free Login

Enable ceph 1 and client 1 to access all server and client nodes (including ceph 1 and client 1) without a password.

Step 1 Generate a public key on ceph 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.


Step 2 Generate a public key on client 1 and send the public key to each server node and client node.
ssh-keygen -t rsa
for i in {1..3}; do ssh-copy-id ceph$i; done
for i in {1..3}; do ssh-copy-id client$i; done

NOTE

After entering the first command ssh-keygen -t rsa, press Enter to use the default configuration.

----End

Disabling SELinux

Disable SELinux on each server node and client node.

● Temporarily disable SELinux. The configuration becomes invalid after the system restarts.
setenforce 0


● Permanently disable SELinux. The configuration takes effect after the system restarts.
vi /etc/selinux/config
Set SELINUX to disabled.

Configuring the Ceph Image Source

Step 1 Create the ceph.repo file on each server node and client node. vi /etc/yum.repos.d/ceph.repo Add the following information to the file:

[Ceph]
name=Ceph packages for $basearch
baseurl=http://download.ceph.com/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://download.ceph.com/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc
priority=1

Step 2 Update the Yum source.
yum clean all && yum makecache


----End

Changing the umask Value

Change the value of umask to 0022 so that Ceph can be properly installed.

Step 1 Open the bashrc file in all clusters.
vi /etc/bashrc
Change the last line to umask 0022.

Step 2 Make the configuration take effect.
source /etc/bashrc

Step 3 Run the following command to check whether the configuration has taken effect:
umask

----End

7.4 Installing Ceph

7.4.1 Installing the Ceph Software

NOTE

When you use yum install to install Ceph, the latest version will be installed by default. In this document, the latest version is Ceph V14.2.11. If you do not want to install the latest version, you can modify the /etc/yum.conf file. For example, you can perform the following operations to install Ceph V14.2.10:
1. Open the /etc/yum.conf file.
vi /etc/yum.conf
2. Add the following information to the [main] section:
exclude=*14.2.11*
In this case, the 14.2.11 version is filtered out, and the latest version that can be installed becomes 14.2.10. Then, when you run the yum install command, Ceph 14.2.10 is installed.
3. Run the yum list ceph command to check the available version.

Step 1 Install Ceph on each cluster node and client node.


dnf -y install ceph

Step 2 Install ceph-deploy on ceph 1.
pip install ceph-deploy
pip install prettytable

Step 3 Add a line of code to the _get_distro function in the /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py file to adapt the software to the openEuler system.
vi /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py
'openeuler': fedora,
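The added line belongs in the distribution-to-module mapping inside the _get_distro function, next to the existing entries such as 'fedora': fedora, so that openEuler hosts are handled like Fedora hosts; the exact surrounding lines vary with the ceph-deploy version. After saving the file, you can optionally confirm that the new mapping is present (an extra check, not part of the original procedure):
grep -n "openeuler" /lib/python2.7/site-packages/ceph_deploy/hosts/__init__.py
The command should print the 'openeuler': fedora, line that was just added.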

Step 4 Run the following command on each node to check the version: ceph -v The command output is similar to the following:


ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)

----End

7.4.2 Deploying MON Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Create a cluster.
cd /etc/ceph
ceph-deploy new ceph1 ceph2 ceph3

Step 2 Configure mon_host, public network, and cluster network in the ceph.conf file that is automatically generated in /etc/ceph.
vi /etc/ceph/ceph.conf

Modify the content in the ceph.conf file as follows:
[global]
fsid = f6b3c38c-7241-44b3-b433-52e276dd53c6
mon_initial_members = ceph1, ceph2, ceph3
mon_host = 192.168.3.166,192.168.3.167,192.168.3.168
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 192.168.3.0/24
cluster_network = 192.168.4.0/24

[mon]
mon_allow_pool_delete = true

NOTE

● Run the command for configuring nodes and use ceph-deploy to configure the OSD in the /etc/ceph directory. Otherwise, an error may occur.
● The modification is to isolate the internal cluster network from the external access network. 192.168.4.0 is used for data synchronization between internal storage clusters (used only between storage nodes), and 192.168.3.0 is used for data exchange between storage nodes and compute nodes.

Step 3 Initialize the monitor and collect the keys. ceph-deploy mon create-initial


Step 4 Copy ceph.client.admin.keyring to each node. ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3 client1 client2 client3

Step 5 Check whether the configuration is successful. ceph -s The configuration is successful if the command output is similar to the following:

cluster: id: f6b3c38c-7241-44b3-b433-52e276dd53c6 health: HEALTH_OK

services: mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h)

----End

7.4.3 Deploying MGR Nodes

NOTE

Perform operations in this section only on ceph 1.

Step 1 Deploy MGR nodes. ceph-deploy mgr create ceph1 ceph2 ceph3


Step 2 Check whether the MGR nodes are successfully deployed. ceph -s

The MGR nodes are successfully deployed if the command output is similar to the following: cluster: id: f6b3c38c-7241-44b3-b433-52e276dd53c6 health: HEALTH_OK

services: mon: 3 daemons, quorum ceph1,ceph2,ceph3 (age 25h) mgr: ceph1(active, since 2d), standbys: ceph2, ceph3

----End

7.4.4 Deploying OSD Nodes

Creating OSD Partitions

NOTE

Perform the following operations on the three Ceph nodes. The following uses /dev/nvme0n1 as an example. If the system has multiple NVMe SSDs or SATA/SAS SSDs, you only need to change /dev/nvme0n1 to the actual drive letters.

The NVMe SSD is divided into twelve 60 GB partitions and twelve 180 GB partitions, which correspond to the WAL and DB partitions respectively.

Step 1 Create the partition.sh script. vi partition.sh

Step 2 Add the following information to the script:
#!/bin/bash

parted /dev/nvme0n1 mklabel gpt

for j in `seq 1 12`
do
((b = $(( $j * 8 ))))
((a = $(( $b - 8 ))))
((c = $(( $b - 6 ))))
str="%"
echo $a
echo $b
echo $c
parted /dev/nvme0n1 mkpart primary ${a}${str} ${c}${str}
parted /dev/nvme0n1 mkpart primary ${c}${str} ${b}${str}
done

NOTE

This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.

Step 3 Run the script.
bash partition.sh

----End

Deploying OSD Nodes

NOTE

In the following script, the 12 drives /dev/sda to /dev/sdl are data drives, and the OS is installed on /dev/sdm. However, if the data drives are not numbered consecutively, for example, the OS is installed on /dev/sde, you cannot run the script directly. Otherwise, an error will be reported during the deployment on /dev/sde. Instead, you need to modify the script to ensure that only data drives are operated and other drives such as the OS drive and SSD drive for DB and WAL partitions are not operated.

Step 1 Run the following command to check the drive letter of each drive on each node. lsblk

Use the command output to identify the OS drive and confirm which drives are data drives.

NOTE

The drives that were ever used as OS drives or data drives in a Ceph cluster may have residual partitions. You can run the lsblk command to check for the drive partitions. For example, if /dev/sdb has residual partitions, run the following command to clear them:
ceph-volume lvm zap /dev/sdb --destroy

CAUTION

You must determine the data drives first, and then run the destroy command only when the data drives have residual partitions.

Step 2 Create the create_osd.sh script on ceph 1 and deploy the OSD node on the 12 drives on each server. cd /etc/ceph/ vi /etc/ceph/create_osd.sh Add the following information to the script:

#!/bin/bash

for node in ceph1 ceph2 ceph3
do
j=1
k=2
for i in {a..l}
do
ceph-deploy osd create ${node} --data /dev/sd${i} --block-wal /dev/nvme0n1p${j} --block-db /dev/nvme0n1p${k}
((j=${j}+2))
((k=${k}+2))
sleep 3
done
done

NOTE

● This script applies only to the current hardware configuration. For other hardware configurations, you need to modify the script.
● In the ceph-deploy osd create command:
– ${node} specifies the hostname of the node.
– --data specifies the data drive.
– --block-db specifies the DB partition.
– --block-wal specifies the WAL partition.
DB and WAL partitions are usually deployed on NVMe SSDs to improve write performance. If no NVMe SSD is configured or NVMe SSDs are used as data drives, you do not need to specify --block-db and --block-wal. You only need to specify --data.

Step 3 Run the script on ceph 1. bash create_osd.sh

Step 4 Check whether all 36 OSD nodes are in the up state. ceph -s

----End

7.5 Verifying Ceph

7.5.1 Configuring MDS Nodes

The Metadata Server (MDS) manages files and directories in the CephFS cluster. Perform the following steps to configure MDS nodes:

Step 1 Create MDS nodes. Run the following commands on ceph 1: cd /etc/ceph ceph-deploy mds create ceph1 ceph2 ceph3


Step 2 Check whether the MDS process is successfully created on each Ceph node. ps -ef | grep ceph-mds | grep -v grep

The MDS process starts successfully if the command output is similar to the following:

ceph 64149 1 0 Nov15 ? 00:01:18 /usr/bin/ceph-mds -f --cluster ceph --id ceph4 --setuser ceph --setgroup ceph

----End

7.5.2 Creating Storage Pools and a File System

NOTE

● CephFS requires two pools to store data and metadata respectively. The following describes how to create the fs_data and fs_metadata pools.
● The two 1024 numbers in the storage pool creation command (for example, ceph osd pool create fs_data 1024 1024) correspond to the pg_num and pgp_num parameters of the storage pools. According to the official Ceph documentation, the recommended total number of storage pool placement groups (PGs) in a cluster is calculated as follows: (Number of OSDs x 100)/Data redundancy factor. For the replication mode, the data redundancy factor is the number of copies. For the erasure code (EC) mode, the data redundancy factor is the sum of data blocks and parity blocks. For example, the data redundancy factor is 3 for the three-replica mode and 6 for the EC 4+2 mode.
● In this example, the cluster is composed of three servers. Each server has 12 OSDs, and there are 36 OSDs in total. According to the preceding formula, the PG quantity is 1200. It is recommended that the PG quantity be an integral power of 2. The data volume of fs_data is much larger than that of the other storage pools and therefore more PGs are allocated to this storage pool. The PG quantity of fs_data is 1024, and the PG quantity of fs_metadata is 128 or 256.

Step 1 Run the following commands to create storage pools on ceph 1:
ceph osd pool create fs_data 1024 1024
ceph osd pool create fs_metadata 128 128

NOTE

In the preceding command, fs_data is the storage pool name, and the two 1024 numbers are the PG quantity and placement group for placement purpose (PGP) quantity. It is the same for fs_metadata.

Step 2 Create a file system based on the storage pools. ceph fs new cephfs fs_metadata fs_data

NOTE

cephfs is the file system name, and fs_metadata and fs_data are storage pool names.


Step 3 Enable zlib compression for the storage pools.
ceph osd pool set fs_data compression_algorithm zlib
ceph osd pool set fs_data compression_mode force
ceph osd pool set fs_data compression_required_ratio .99

NOTE

This step enables OSD compression. Skip this step if OSD compression is not required.

Step 4 View the created CephFS.
ceph fs ls

----End

7.5.3 Mounting the File System to the Clients

Step 1 Log in to any client node and check the key used for the client node to access the Ceph cluster. cat /etc/ceph/ceph.client.admin.keyring

NOTE

You only need to run the command once. The keys for server nodes and client nodes are the same.

Step 2 Create a file system mount point on each client node.
mkdir /mnt/cephfs

Step 3 Run the following command on each client node:
mount -t ceph 192.168.3.166:6789,192.168.3.167:6789,192.168.3.168:6789:/ /mnt/cephfs -o name=admin,secret=<key obtained in Step 1>,sync

NOTE

The default MON port number is 6789. The -o parameter specifies the user name and secret key used to access the cluster.


Step 4 Check whether the file system is successfully mounted to each client node and whether the file system type is ceph.
stat -f /mnt/cephfs
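
Read/write access can also be verified with a small test file; a minimal sketch (the file name is arbitrary):

# Write a 100 MB test file into the mounted file system, list it, then remove it.
dd if=/dev/zero of=/mnt/cephfs/testfile bs=1M count=100
ls -lh /mnt/cephfs/testfile
rm -f /mnt/cephfs/testfile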

----End


8 Ceph Automatic Deployment Guide (CentOS 7.6)

8.1 Introduction
8.2 Configuring the Deployment Environment
8.3 Installing Ceph

8.1 Introduction

Overview

Ceph is a distributed storage system designed for high performance, reliability, and scalability. It provides 3-in-1 interfaces for object-, block-, and file-level storage. Figure 8-1 shows the Ceph architecture.


Figure 8-1 Ceph architecture

Table 8-1 describes the modules of Ceph.

Table 8-1 Module functions

RADOS Reliable Autonomic Distributed Object Store (RADOS) is the heart of a Ceph storage cluster. Everything in Ceph is stored by RADOS in the form of objects irrespective of their data types. The RADOS layer ensures data consistency and reliability through data replication, fault detection and recovery, and data recovery across cluster nodes.

OSD Object storage daemons (OSDs) store the actual user data. Every OSD is usually bound to one physical drive. The OSDs handle the read/write requests from clients.

MON The monitor (MON) is the most important component in a Ceph cluster. It manages the Ceph cluster and maintains the status of the entire cluster. The MON ensures that related components of the cluster stay synchronized. It functions as the leader of the cluster and is responsible for collecting, updating, and publishing cluster information. To avoid single points of failure (SPOFs), multiple MONs are deployed in a Ceph environment and coordinate with each other.



MGR The manager (MGR) is a monitoring system that provides collection, storage, analysis (including alarming), and visualization functions. It makes certain cluster parameters available for external systems.

Librados Librados is a library that simplifies access to RADOS. It currently supports the PHP, Ruby, Java, Python, C, and C++ programming languages. It provides a native interface to the RADOS storage cluster and is the base component of other services such as the RADOS block device (RBD) and RADOS gateway (RGW). In addition, it provides the Portable Operating System Interface (POSIX) for the Ceph file system (CephFS). The Librados API can be used to access RADOS directly, enabling developers to create their own interfaces to the Ceph storage cluster.

RBD The RADOS block device (RBD) is the Ceph block device that provides block storage for external systems. It can be mapped, formatted, and mounted like a drive to a server.

RGW The RADOS gateway (RGW) is a Ceph object gateway that provides RESTful APIs compatible with S3 and Swift. The RGW also supports multi-tenant and OpenStack Identity service (Keystone).

MDS The Ceph Metadata Server (MDS) tracks the file hierarchy and stores metadata, which is used only for CephFS. The RBD and RGW do not require the MDS. The MDS does not directly provide data services for clients.

CephFS The CephFS provides a POSIX-compatible distributed file system of any size. It depends on the Ceph MDS to track the file hierarchy, namely the metadata.

Recommended Version

14.2.1

8.2 Configuring the Deployment Environment

8.2.1 Configuring the Physical Environment

Prepare the servers and clients, and complete the physical networking of the physical servers based on the networking plan.

NOTE

In this deployment, three client nodes and three TaiShan 200 servers (model 2280) are used. The TaiShan 200 servers serve as storage nodes.


The networks include the front-end network (public network) and back-end network (cluster network). Figure 8-2 shows a typical physical Ceph network.

Figure 8-2 Physical networking

8.2.2 Configuring Software

OS Configuration

Install the driver for each NIC and configure the IP address of the management NIC.

NOTE

In this document, the Hi1822 NICs are used.

Step 1 Ensure that the NIC drivers are properly installed.
hinicadm info


Step 2 Ensure that the management network IP address has been configured.
ip add show

----End

Obtaining Ceph Installation and Configuration Information

Step 1 Run the lsblk command on the servers to obtain the drive letter names and drive sizes.
lsblk
● As shown in the figure below, the data drive letters are "sda", "sdb", "sdc", "sdd", "sde", "sdf", "sdg", "sdh", "sdi", "sdk", and "sdl", and the drive size is 3.7 TB. When setting installation parameters in the ceph_input file (in 8.2.3 Setting Installation Parameters), convert 3.7 TB to 3700 GB.
● Obtain information about the SSDs that need to be used as DB and WAL drives. The drive letters are nvme0n1 and nvme1n1, and the drive size is 2.9 TB. When setting installation parameters in the ceph_input file (in 8.2.3 Setting Installation Parameters), convert 2.9 TB to 2900 GB.
A filtered listing sketch follows this list.
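
To list only whole drives with their names and sizes (a minimal sketch; the column set may vary slightly with the util-linux version):

# Show whole drives only (no partitions), with name, size, rotational flag, and model.
lsblk -d -o NAME,SIZE,ROTA,MODEL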


NOTE

If you do not need to configure DB and WAL disks separately, skip the corresponding steps.

Step 2 Obtain the names of the NICs for which the service network needs to be configured on a node.
ip add show

----End

8.2.3 Setting Installation Parameters

Automation tool: ceph-ansible-autotool.zip
Download URL: https://mirrors.huaweicloud.com/kunpeng/archive/kunpeng_solution/storage/Tools/
Open the ceph_input.xlsx configuration file in the storage/input directory of the automation tool, and set parameters in the Ceph Configuration, Server Configuration, and Client Configuration sheets.

NOTE

For details about the parameters, see the description of each parameter in the Excel file. If you do not need to install the client, clear the Client Configuration sheet.

● Ceph configuration example


● Server configuration example

● Client configuration example

8.2.4 Verifying the Installation Parameters

Step 1 Download the entire installation suite and upload it to a server where Ceph is to be installed.

Step 2 Decompress the package to any directory.

Step 3 Replace the content of the Excel file in the input folder in the decompression directory with the configuration information entered in 8.2.3 Setting Installation Parameters.

Step 4 Check whether the parameters are correctly set.
sh input_check.sh
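
Put together, these steps might look like the following sketch. The archive name comes from 8.2.3 Setting Installation Parameters; the extraction directory (/opt/ceph-autotool), the path of the edited Excel file, and the location of input_check.sh at the top of the extracted package are assumptions.

# Assumed extraction directory and file paths; adjust to your environment.
unzip ceph-ansible-autotool.zip -d /opt/ceph-autotool
cp /root/ceph_input.xlsx /opt/ceph-autotool/storage/input/ceph_input.xlsx
cd /opt/ceph-autotool
sh input_check.sh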

----End

8.3 Installing Ceph

Deploying Ceph

NOTE

By default, Ceph 14.2.1 is installed by the Ceph automatic installation suite. To change the version number, modify the yum_exclude field in the scripts/conf/manual.cfg file. If a proxy needs to be configured, modify the config_proxy and config_no_proxy fields in scripts/conf/manual.cfg.
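
Before editing, the current values of the fields named above can be checked in place; a minimal sketch, assuming the commands are run from the directory where the suite was extracted:

# Show the version-exclusion and proxy fields, then edit them as needed.
grep -E 'yum_exclude|config_proxy|config_no_proxy' scripts/conf/manual.cfg
vi scripts/conf/manual.cfg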


Step 1 Run the installation script to install Ceph.
sh install_all.sh

Step 2 View the "ceph install" entry at the bottom of the log to check whether the installation is successful. If "ok" is displayed, the installation is successful, as shown in the figure below.

----End

Verifying the Ceph Deployment

Log in to a server where the Ceph server is installed, and check whether the installed components are the same as those in the configuration file.

ceph -s


Installing the Clients

Log in to a server where the Ceph server is installed, and copy the ceph.client.admin.keyring file from the /etc/ceph directory to each client node.

cd /etc/ceph
scp ceph.client.admin.keyring root@<client hostname>:/etc/ceph/
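
If there are several client nodes, the copy can be scripted; a minimal sketch, assuming the client hostnames are client1, client2, and client3 and that SSH access from the server to the clients is already configured:

cd /etc/ceph
# Hypothetical client hostnames; replace them with the actual ones.
for host in client1 client2 client3; do
  scp ceph.client.admin.keyring root@${host}:/etc/ceph/
done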


A Change History

Date Description

2021-03-23 This issue is the fifth official release. Changed the solution name from "Kunpeng SDS solution" to "Kunpeng BoostKit for SDS".

2020-12-09 This issue is the fourth official release. Added the guide for manually creating the repo source compression package offline in the 1 Ceph-Ansible Deployment Guide (CentOS 7.6).

2020-09-27 This issue is the third official release.
● Divided the Ceph Block Storage Installation Guide (CentOS 7.6 & openEuler 20.03) into 2 Ceph Block Storage Deployment Guide (CentOS 7.6) and 3 Ceph Block Storage Deployment Guide (openEuler 20.03) based on the OS.
● Divided the Ceph Object Storage Installation Guide (CentOS 7.6 & openEuler 20.03) into 4 Ceph Object Storage Deployment Guide (CentOS 7.6) and 5 Ceph Object Storage Deployment Guide (openEuler 20.03) based on the OS.
● Divided the Ceph File Storage Installation Guide (CentOS 7.6 & openEuler 20.03) into 6 Ceph File Storage Deployment Guide (CentOS 7.6) and 7 Ceph File Storage Deployment Guide (openEuler 20.03) based on the OS.

2020-05-09 This issue is the second official release. Modified 1.5 Configuring Block Storage, 1.6 Configuring File Storage, and 1.7 Configuring Object Storage in the Ceph-Ansible Deployment Guide (CentOS 7.6).

2020-03-20 This issue is the first official release.
