OCFS2 Support Guide - Cluster Start and Stop

Introduction

This support guide is a supplement to the OCFS2 User's Guide and the OCFS2 FAQ. The information provided is directed towards support personnel. End users should consult the User's Guide and/or the FAQ for information on setting up and using OCFS2.

Install

The OCFS2 product is distributed by the kernel group in RPM form. Customers tend to have difficulty determining which of the shipped packages they should install.

For the ocfs2-tools and ocfs2console products, we provide one RPM for a given vendor distribution and architecture (e.g., RHEL4/i386). These packages are only updated when bug fixes are made to the tools or console.

For the ocfs2 kernel module, we provide multiple kernel module packages for each distribution and architecture.

Kernel modules are tied to a specific kernel version. Unlike RHEL3, neither Red Hat nor SuSE provides kernel ABI compatibility across releases in the 2.6-based kernels, RHEL4 and SLES9. Module vendors therefore have to build and ship modules for each new kernel release. Refer to the FAQ to determine the kernel version a package is built for, then download the appropriate packages and install them.
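For example, on a RHEL4/i386 system running the 2.6.9-22.0.1.ELsmp kernel, the install might look like the following (the package file names are illustrative; use the ones matching your kernel and architecture):

# rpm -Uvh ocfs2-tools-1.2.1-1.i386.rpm ocfs2console-1.2.1-1.i386.rpm \
       ocfs2-2.6.9-22.0.1.ELsmp-1.2.1-1.i686.rpm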

Loading the modules

Once the proper modules and tools are installed, the first error frequently encountered by users is that the "modules are not loading." This means that either the correct ones have not been installed or the modules are not being located by modprobe.

# rpm -qa | grep ocfs2
ocfs2-2.6.9-22.0.1.ELsmp-1.2.1-1
ocfs2console-1.2.1-1
ocfs2-tools-1.2.1-1

# uname -a
Linux node01 2.6.9-22.0.1.ELsmp #1 SMP Fri Dec 2 15:52:15 PST 2005 i686 i686 i386 GNU/Linux

ocfs2-2.6.9-22.0.1.ELsmp-1.2.1-1 is the kernel module rpm. Ensure it matches the kernel version. If it does, the correct ones have been installed.

To check whether the modules are being located or not, we first need to ensure the files exist. To list the contents of the package, do:

# rpm -ql ocfs2-2.6.9-22.0.1.ELsmp-1.2.1-1
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/configfs
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/configfs/configfs.ko
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/debugfs
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/debugfs/debugfs.ko
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/ocfs2
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/ocfs2/ocfs2.ko
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/ocfs2/ocfs2_dlm.ko
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/ocfs2/ocfs2_dlmfs.ko
/lib/modules/2.6.9-22.0.1.ELsmp/kernel/fs/ocfs2/ocfs2_nodemanager.ko

Use find, ls, or any other tool to ensure all the files exist. Also check whether the same module(s) exist in a different location under the /lib/modules/`uname -r` directory. modprobe will pick up the first matching module it finds, and it may not be the one we want it to load. If you find a matching module in another directory, find the package that module was bundled with (see the example below).
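For example, to search the running kernel's module tree for any copies of the OCFS2 and configfs modules (a sketch using the module names from the package listing above):

# find /lib/modules/`uname -r` -name "ocfs2*.ko" -o -name "configfs.ko" -o -name "debugfs.ko"

A copy outside the paths listed by rpm -ql is the suspect; check which package, if any, owns it: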

# rpm -qf /path/to/rogue/module

If it lists a package name, investigate how a matching module could be shipped in another package. "rpm --erase <package>" may be in order.

On the other hand, if that comes up empty, then one has to assume it was hand copied. Either move it or delete it.

If the files are ok, the next step is to rebuild the module dependencies.

# depmod -a

Try to load the modules again. If the load still fails, attempt the various loads manually.

# modprobe configfs
# mount -t configfs configfs /config
# modprobe ocfs2_nodemanager
# modprobe ocfs2_dlm
# modprobe ocfs2_dlmfs
# mount -t ocfs2_dlmfs ocfs2_dlmfs /dlm
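If the loads succeed, the modules and the two pseudo-filesystem mounts should now be visible (standard lsmod and mount usage; the patterns simply match the names used above):

# lsmod | egrep "ocfs2|configfs"
# mount | egrep "configfs|ocfs2_dlmfs"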

It could simply be that /bin is not in the PATH. As always, keep checking dmesg for any errors. If the error originates in kernel space, a more descriptive message will be listed in dmesg.
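For example, immediately after a failed modprobe or mount:

# dmesg | tail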

Starting the Cluster

OCFS2 bundles a cluster stack referred to as O2CB. One needs to start O2CB before one can mount OCFS2 volumes. /etc/init.d/o2cb is the init script which is used to start and stop the cluster.

However, before the cluster can be started, /etc/ocfs2/cluster.conf needs to be populated with the list of all nodes in the cluster. Also, the default cluster name needs to be specified in /etc/sysconfig/o2cb. This can be done by hand or by running "/etc/init.d/o2cb configure". Refer to the user's guide for the steps to populate cluster.conf.
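For reference, after running "/etc/init.d/o2cb configure" the relevant lines in /etc/sysconfig/o2cb typically look like the following (variable names as used by the standard o2cb init script; the cluster name ocfs2 is only an example):

O2CB_ENABLED=true
O2CB_BOOTCLUSTER=ocfs2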

Common errors include specifying incorrect IP addresses and node names. While a public IP can be used, it is not recommended; one should use private IPs for security and lower latency. The node name must match the hostname of the node that entry is for.
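As a point of reference, a minimal two-node cluster.conf might look like the following (the node names, private addresses, and cluster name are examples only; see the user's guide for the authoritative format):

node:
        ip_port = 7777
        ip_address = 192.168.1.101
        number = 0
        name = node01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.102
        number = 1
        name = node02
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2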

Once populated, one should ensure that the file is the same on all nodes in the cluster. An easy check is comparing the md5 checksums of the file on each node.

# md5sum /etc/ocfs2/cluster.conf

The checksum should be the same on all the nodes. Even though O2CB only expects the config file to be semantically the same and not necessarily have matching checksums, having the exact same file on all nodes makes cluster management easier.
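One way to compare the file across nodes in a single step (assuming passwordless ssh and the example node names above):

# for n in node01 node02; do ssh $n md5sum /etc/ocfs2/cluster.conf; done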

To start the cluster, do:

# /etc/init.d/o2cb online

The o2cb init script calls o2cb_ctl to read cluster.conf and populate /config/cluster/<clustername>/node with the information. When o2cb_ctl detects a node name in the config file matching the hostname of the node, it writes 1 to /config/cluster/<clustername>/node/<nodename>/local. The node manager in the kernel, in turn, launches the [o2net] thread.
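Continuing the hypothetical two-node example, the populated entries can be inspected directly in configfs (paths assume the example cluster name ocfs2):

# ls /config/cluster/ocfs2/node/
node01  node02
# cat /config/cluster/ocfs2/node/node01/local
1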

The cluster is now online.
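To confirm, the status action of the o2cb init script reports the cluster and module state:

# /etc/init.d/o2cb status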

Keep in mind that the node manager's view of the cluster members is only what is populated in /config/cluster/<clustername>/node. If one edits cluster.conf after the cluster is started, the new configuration will not take effect until the cluster is restarted.

Stopping the Cluster

One cannot stop the cluster if there are any active heartbeat regions, which indicate mounted volume(s). In other words, one can stop the cluster only if there are no entries in /config/cluster/<clustername>/heartbeat.
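A quick check before attempting to stop the cluster (assuming the example cluster name ocfs2) is to confirm that no OCFS2 volumes are mounted and that the heartbeat directory is empty:

# mount -t ocfs2
# ls /config/cluster/ocfs2/heartbeat/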

To stop the cluster, do:

# /etc/init.d/o2cb offline

The o2cb init script calls o2cb_ctl, which reads cluster.conf and removes the corresponding entries in /config/cluster/<clustername>/node. It should be noted that o2cb_ctl only removes entries listed in cluster.conf. So, if one were to manually edit the config file while the cluster was online, o2cb may not be able to clean up all the entries in /config. The cluster may even indicate that it is stopped, but some modules will not unload.

To understand how to fix such an issue, let us unload the modules manually.

# lsmod | egrep "ocfs|config"
ocfs2_dlmfs            25864  1
ocfs2_dlm             210056  1 ocfs2_dlmfs
ocfs2_nodemanager     178384  103 ocfs2_dlmfs,ocfs2_dlm
configfs               26764  2 ocfs2_nodemanager

Notice the reference counts in the 3rd column. ocfs2_dlmfs has 1. That is because dlmfs is mounted at /dlm.

# mount | grep ocfs2_dlm
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)

umount /dlm manually and the reference count drops to 0.

# umount /dlm/
# lsmod | egrep "ocfs|config"
ocfs2_dlmfs            25864  0
ocfs2_dlm             210056  1 ocfs2_dlmfs
ocfs2_nodemanager     178384  103 ocfs2_dlmfs,ocfs2_dlm
configfs               26764  2 ocfs2_nodemanager

One can unload inactive modules using rmmod.

# rmmod ocfs2_dlmfs
# rmmod ocfs2_dlm
# lsmod | egrep "ocfs|config"
ocfs2_nodemanager     178384  101
configfs               26764  2 ocfs2_nodemanager

The node manager reference counts are due to cluster information populated in /config.

# ls /config/cluster/<clustername>/node/ | wc -l
100
# ls /config/cluster/ | wc -l
1

In this case, 100 nodes and 1 cluster. To remove entries by hand, do:

# rmdir /config/cluster/<clustername>/node/*
# rmdir /config/cluster/*
# lsmod | egrep "ocfs|config"
ocfs2_nodemanager     178384  0
configfs               26764  2 ocfs2_nodemanager

# rmmod ocfs2_nodemanager
# lsmod | egrep "ocfs|config"
configfs               26764  1

# mount | grep configfs
configfs on /config type configfs (rw)

Like dlmfs, configfs also needs to be umounted.

# umount /config
# rmmod configfs
# lsmod | egrep "ocfs|config"

If one were to edit cluster.conf while the cluster was up and then try to shut down the cluster, the shutdown may fail. Success or failure depends on the change made. If the user only added new nodes, it would work. However, if the user deleted existing nodes, one would need to manually remove the entries from /config as listed above. In any of these cases, rebooting the node would also "fix" the problem.