GlusterFS Documentation, Release 3.8.0
Gluster Community
Aug 10, 2016

Contents

1 Quick Start Guide
  1.1 Single Node Cluster
  1.2 Multi Node Cluster
2 Overview and Concepts
  2.1 Volume Types
  2.2 FUSE
  2.3 Translators
  2.4 Geo-Replication
  2.5 Terminologies
3 Installation Guide
  3.1 Getting Started
  3.2 Configuration
  3.3 Installing Gluster
  3.4 Overview
  3.5 Quick Start Guide
  3.6 Setup Baremetal
  3.7 Deploying in AWS
  3.8 Setting up in virtual machines
4 Administrator Guide
5 Upgrade Guide
6 Contributors Guide
  6.1 Adding your blog
  6.2 Bug Lifecycle
  6.3 Bug Reporting Guidelines
  6.4 Bug Triage Guidelines
7 Changelog
8 Presentations

GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. GlusterFS is free and open source software.

Chapter 1: Quick Start Guide

For this tutorial, we will assume you are using Fedora 22 (or later) virtual machine(s). If you would like a more detailed walkthrough, with instructions for installing using different methods (local virtual machines, EC2, and baremetal) and different distributions, have a look at the Install Guide.

1.1 Single Node Cluster

This demonstrates installing and setting up GlusterFS in under five minutes. You would not want to do this in any real-world scenario.

Install the GlusterFS client and server packages:

# yum install glusterfs glusterfs-server glusterfs-fuse

Start the glusterd service:

# service glusterd start

Create 4 loopback devices to be consumed as bricks; this simulates 4 hard disks in 4 different nodes. Format the loopback devices with the XFS filesystem and mount them:

# truncate -s 1GB /srv/disk{1..4}
# for i in `seq 1 4`; do mkfs.xfs -i size=512 /srv/disk$i; done
# mkdir -p /export/brick{1..4}
# for i in `seq 1 4`; do echo "/srv/disk$i /export/brick$i xfs loop,inode64,noatime,nodiratime 0 0" >> /etc/fstab; done
# mount -a

Create a 2x2 distributed-replicated volume and start it:

# gluster volume create test replica 2 transport tcp `hostname`:/export/brick{1..4}/data force
# gluster volume start test

Mount the volume for consumption:

# mkdir /mnt/test
# mount -t glusterfs `hostname`:test /mnt/test

For illustration, create 10 empty files and see how they get distributed and replicated among the 4 bricks that make up the volume:

# touch /mnt/test/file{1..10}
# ls /mnt/test
# tree /export/
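Because the bricks were listed in order at create time, brick1/brick2 form one replica pair and brick3/brick4 the other, so paired bricks should hold identical file sets. A minimal sketch to eyeball this, using only the loopback-brick paths set up above:

# Print each brick's contents; with "replica 2", the listings for
# brick1/brick2 should match each other, as should brick3/brick4.
for i in `seq 1 4`; do
    echo "brick$i:"
    ls /export/brick$i/data
done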
1.2 Multi Node Cluster

1.2.1 Step 1 - Have at least two nodes

• Fedora 22 (or later) on two nodes named "server1" and "server2"
• A working network connection
• At least two virtual disks, one for the OS installation and one to be used to serve GlusterFS storage (sdb). This will emulate a real-world deployment, where you would want to separate GlusterFS storage from the OS install.

1.2.2 Step 2 - Format and mount the bricks (on both nodes)

Note: These examples assume the brick will reside on /dev/sdb1.

# mkfs.xfs -i size=512 /dev/sdb1
# mkdir -p /data/brick1
# echo '/dev/sdb1 /data/brick1 xfs defaults 1 2' >> /etc/fstab
# mount -a && mount

You should now see sdb1 mounted at /data/brick1.

1.2.3 Step 3 - Installing GlusterFS (on both servers)

Install the software:

# yum install glusterfs-server

Start the GlusterFS management daemon:

# service glusterd start
# service glusterd status

1.2.4 Step 4 - Configure the trusted pool

From "server1":

# gluster peer probe server2

Note: When using hostnames, the first server needs to be probed from one other server to set its hostname.

From "server2":

# gluster peer probe server1

Note: Once this pool has been established, only trusted members may probe new servers into the pool. A new server cannot probe the pool; it must be probed from the pool.

1.2.5 Step 5 - Set up a GlusterFS volume

From any single server:

# gluster volume create gv0 replica 2 server1:/data/brick1/gv0 server2:/data/brick1/gv0
# gluster volume start gv0

Confirm that the volume shows "Started":

# gluster volume info

Note: If the volume is not started, clues as to what went wrong will be in log files under /var/log/glusterfs on one or both of the servers, usually in etc-glusterfs-glusterd.vol.log.

1.2.6 Step 6 - Testing the GlusterFS volume

For this step, we will use one of the servers to mount the volume. Typically, you would do this from an external machine known as a "client". Since that would require additional packages to be installed on the client machine, we will use one of the servers as a simple place to test first, as if it were that "client".

# mount -t glusterfs server1:/gv0 /mnt
# for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/copy-test-$i; done

First, check the client mount point:

# ls -lA /mnt | wc -l

You should see 100 files returned. Next, check the GlusterFS brick mount points on each server:

# ls -lA /data/brick1/gv0

You should see 100 files on each server using the method we listed here. Without replication, in a distribute-only volume (not detailed here), you should see about 50 files on each one.
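To double-check the replication from the servers themselves, you can count the copied files on each brick; with "replica 2", both servers should report all 100 copies. A minimal sketch, assuming passwordless SSH between the nodes (which this guide does not set up):

# Count the copy-test files on each server's brick; both
# counts should be 100 for a healthy 2-way replica.
for host in server1 server2; do
    echo -n "$host: "
    ssh "$host" 'ls /data/brick1/gv0/copy-test-* | wc -l'
done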
Chapter 2: Overview and Concepts

2.1 Volume Types

A volume is a collection of bricks, and most Gluster file system operations happen on the volume. GlusterFS supports different types of volumes based on your requirements: some are good for scaling storage size, some for improving performance, and some for both.

Distributed Volume - This is the default GlusterFS volume type; if you do not specify a type when creating a volume, a distributed volume is created. Here, files are distributed across the various bricks in the volume, so file1 may be stored on brick1 or brick2 but not on both. Hence there is no data redundancy. The purpose of such a volume is to easily and cheaply scale the volume size. However, this also means that a brick failure will lead to complete loss of data, and one must rely on the underlying hardware for protection against data loss.

Create a distributed volume:

gluster volume create NEW-VOLNAME [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a distributed volume with four storage servers using TCP:

# gluster volume create test-volume server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
Creation of test-volume has been successful
Please start the volume to access data

To display the volume info:

# gluster volume info
Volume Name: test-volume
Type: Distribute
Status: Created
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: server1:/exp1
Brick2: server2:/exp2
Brick3: server3:/exp3
Brick4: server4:/exp4

Replicated Volume - This volume overcomes the data-loss problem faced in the distributed volume. Here, exact copies of the data are maintained on all bricks. The number of replicas in the volume is decided by the client while creating the volume, so you need at least two bricks to create a volume with 2 replicas, or a minimum of three bricks for a volume with 3 replicas. One major advantage of such a volume is that even if one brick fails, the data can still be accessed from its replica bricks. Such a volume is used for better reliability and data redundancy.

Create a replicated volume:

gluster volume create NEW-VOLNAME [replica COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, to create a replicated volume with two storage servers:

# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2
Creation of test-volume has been successful
Please start the volume to access data

Distributed Replicated Volume - In this volume, files are distributed across replicated sets of bricks. The number of bricks must be a multiple of the replica count. The order in which the bricks are specified also matters, since adjacent bricks become replicas of each other. This type of volume is used when both scaling storage and high availability through redundancy are required. So if there were eight bricks and a replica count of 2, the first two bricks become replicas of each other, then the next two, and so on; this volume is denoted 4x2. Similarly, with eight bricks and a replica count of 4, four bricks become replicas of each other, and we denote this as a 2x4 volume.

Create the distributed replicated volume:

# gluster volume create NEW-VOLNAME [replica COUNT] [transport [tcp | rdma | tcp,rdma]] NEW-BRICK...

For example, a four-node distributed (replicated) volume with a two-way mirror:

# gluster volume create test-volume replica 2 transport tcp server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4
Creation of test-volume has been successful
Please start the volume to access data

Striped Volume - Consider a large file being stored in a brick which is frequently accessed by many clients at the same time.
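As an aside on the 4x2 / 2x4 notation used above: gluster volume info reports the layout directly in its "Number of Bricks" line, as distribute count times replica count (for the eight-brick, replica-2 case, something like "Number of Bricks: 4 x 2 = 8"; the exact formatting can differ slightly between releases). A quick way to pull just that line, assuming the test-volume from the example exists:

# Show how gluster reports the distribute x replica layout
# for an existing volume.
gluster volume info test-volume | grep "Number of Bricks"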