Oracle® Linux 8: Managing Shared File Systems

F29522-06
August 2021

Oracle Legal Notices

Copyright © 2020, 2021, Oracle and/or its affiliates.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:

U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract. The terms governing the U.S. Government's use of Oracle cloud services are defined by the applicable contract for such services. No other rights are granted to the U.S. Government.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Inside are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Epyc, and the AMD logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.

Abstract

Oracle® Linux 8: Managing Shared File Systems provides tasks for managing shared file systems in Oracle Linux 8.

Document generated on: 2021-08-18 (revision: 12368)

Table of Contents

Preface
1 About Shared File System Management in Oracle Linux
2 Managing the Network File System in Oracle Linux
    2.1 About NFS
        2.1.1 Supported Versions of NFS
        2.1.2 About NFS Services
    2.2 Configuring an NFS Server
        2.2.1 Configuring an NFS Server by Editing the /etc/exports File
        2.2.2 Configuring an NFS Server by Using the exportfs Command
    2.3 Mounting an NFS File System
3 Managing the Oracle Cluster File System Version 2 in Oracle Linux
    3.1 About OCFS2
    3.2 Maximum File System Size Requirements for OCFS2
    3.3 Installing and Configuring an OCFS2 Cluster
        3.3.1 Preparing a Cluster for OCFS2
        3.3.2 Configuring the Firewall for the Cluster
        3.3.3 Configuring the Cluster Software
        3.3.4 Creating the Configuration File for the Cluster Stack
        3.3.5 Configuring the Cluster Stack
        3.3.6 Configuring the Kernel for Cluster Operation
        3.3.7 Commands for Administering the Cluster Stack
    3.4 Administering OCFS2 Volumes
        3.4.1 Commands for Creating OCFS2 Volumes
        3.4.2 Suggested Cluster Size Settings
        3.4.3 Creating OCFS2 Volumes
        3.4.4 Mounting OCFS2 Volumes
        3.4.5 Querying and Changing Volume Parameters
    3.5 Creating a Local OCFS2 File System
    3.6 Troubleshooting OCFS2 Issues
        3.6.1 Recommended Debugging Tools and Practices
        3.6.2 Mounting the debugfs File System
        3.6.3 Configuring OCFS2 Tracing
        3.6.4 Debugging File System Locks
        3.6.5 Configuring the Behavior of Fenced Nodes
    3.7 OCFS2 Use Cases
        3.7.1 Load Balancing Use Case
        3.7.2 Oracle Real Application Cluster Use Case
        3.7.3 Oracle Database Use Case
4 Managing Samba in Oracle Linux
    4.1 About Samba
        4.1.1 About Samba Services
        4.1.2 About the Samba Configuration File
    4.2 Configuring and Using Samba
        4.2.1 Testing Samba Configuration by Using the testparm Command
        4.2.2 Configuring a Samba Stand-Alone Server
        4.2.3 Configuring a Samba Stand-Alone Server Within a Workgroup
        4.2.4 Adding a Samba Server as a Member of an AD Domain
    4.3 Accessing Samba Shares
        4.3.1 Accessing Samba Shares From a Windows Client
        4.3.2 Accessing Samba Shares From an Oracle Linux Client

Preface

Oracle® Linux 8: Managing Shared File Systems provides information about managing shared file systems in Oracle Linux 8.

Audience

This document is intended for administrators who need to configure and administer Oracle Linux. It is assumed that readers are familiar with web technologies and have a general understanding of using the Linux operating system, including knowledge of how to use a text editor such as emacs or vim, essential commands such as cd, chmod, chown, ls, mkdir, mv, ps, pwd, and rm, and using the man command to view manual pages.

Document Organization

The document is organized into the following chapters:

• Chapter 1, About Shared File System Management in Oracle Linux describes the types of shared file systems that are provided in Oracle Linux 8 and how they are used.

• Chapter 2, Managing the Network File System in Oracle Linux describes tasks for the distributed Network File System (NFS), including instructions for setting up NFS servers and clients.

• Chapter 3, Managing the Oracle Cluster File System Version 2 in Oracle Linux describes how to configure and use the Oracle Cluster File System Version 2 (OCFS2) file system in Oracle Linux.

• Chapter 4, Managing Samba in Oracle Linux describes tasks for the Samba shared file system, including instructions for setting up Samba servers.

Related Documents

The documentation for this product is available at:

https://docs.oracle.com/en/operating-systems/oracle-linux/.

Conventions

The following text conventions are used in this document:

Convention   Meaning
boldface     Boldface type indicates graphical user interface elements associated with an action, or terms defined in text or the glossary.
italic       Italic type indicates book titles, emphasis, or placeholder variables for which you supply particular values.
monospace    Monospace type indicates commands within a paragraph, URLs, code in examples, text that appears on the screen, or text that you enter.

Documentation Accessibility

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at https://www.oracle.com/corporate/accessibility/.

Access to Oracle Support for Accessibility

Oracle customers that have purchased support have access to electronic support through My Oracle Support. For information, visit https://www.oracle.com/corporate/accessibility/learning-support.html#support-tab.

Diversity and Inclusion

Oracle is fully committed to diversity and inclusion. Oracle respects and values having a diverse workforce that increases thought leadership and innovation. As part of our initiative to build a more inclusive culture that positively impacts our employees, customers, and partners, we are working to remove insensitive terms from our products and documentation. We are also mindful of the necessity to maintain compatibility with our customers' existing technologies and the need to ensure continuity of service as Oracle's offerings and industry standards evolve. Because of these technical constraints, our effort to remove insensitive terms is ongoing and will take time and external cooperation.

Chapter 1 About Shared File System Management in Oracle Linux

This chapter includes a brief description of shared file systems and includes information about the types of shared file systems that are provided in Oracle Linux 8 and how they are used. The subsequent chapters of this guide describe how to manage each of the shared file system types that are described in this chapter.

For information about managing local file systems in Oracle Linux, see Oracle® Linux 8: Managing Local File Systems.

A shared file system is a type of file system that enables multiple users to access the same files across different operating systems or over a network at the same time. The shared file system approach provides many benefits. Most notably, using shared file systems can improve performance and scalability and greatly reduce the time that administrators spend managing data.

Oracle Linux 8 includes support for several file system types, including the following distributed and shared file systems:

Network File System (NFS) Is a distributed file system that enables users and client systems to access files over a network, as though the files were on local storage. An NFS server can share directory hierarchies in its local file systems with remote client systems over an IP-based network.

For more information about NFS, see Chapter 2, Managing the Network File System in Oracle Linux.

Oracle Cluster File System Version 2 (OCFS2) Is a general-purpose, shared-disk file system that is intended for use with clusters. OCFS2 offers high performance, as well as high availability. When you use clustered file systems, all of the nodes understand the file system structure and the full file system is shared across all of the nodes.

For more information about OCFS2, see Chapter 3, Managing the Oracle Cluster File System Version 2 in Oracle Linux.

Samba Is an open-source implementation of the Server Message Block (SMB) protocol that provides the capability for Oracle Linux systems to interoperate with Microsoft Windows systems, as both a server and a client. You can use Samba for file sharing across different operating systems over a network.

For more information about Samba, see Chapter 4, Managing Samba in Oracle Linux.

Chapter 2 Managing the Network File System in Oracle Linux

Table of Contents

2.1 About NFS
    2.1.1 Supported Versions of NFS
    2.1.2 About NFS Services
2.2 Configuring an NFS Server
    2.2.1 Configuring an NFS Server by Editing the /etc/exports File
    2.2.2 Configuring an NFS Server by Using the exportfs Command
2.3 Mounting an NFS File System

This chapter includes information about managing the Network File System (NFS) in Oracle Linux 8, including tasks for configuring, administering, and using NFS.

For information about local file system management in Oracle Linux, see Oracle® Linux 8: Managing Local File Systems.

2.1 About NFS

NFS (Network File System) is a distributed file system that enables a client system to access files over a network, as though the files were on local storage.

An NFS server can share directory hierarchies in its local file systems with remote client systems over an IP-based network. After an NFS server exports a directory, NFS clients can then mount this directory, provided that they have been granted the appropriate permissions. To the client systems, the directory appears as if it were a local directory. Benefits of using NFS include centralized storage provisioning, improved data consistency, and reliability.

2.1.1 Supported Versions of NFS

The following versions of NFS are supported in Oracle Linux 8:

• NFS version 3 (NFSv3), specified in RFC 1813.

• NFS version 4 (NFSv4), specified in RFC 7530.

• NFS version 4 minor version 1 (NFSv4.1), specified in RFC 5661.

• NFS version 4 minor version 2 (NFSv4.2), specified in RFC 7862.

Note

NFSv2 is no longer supported.

NFSv3 provides safe, asynchronous writes and efficient error handling. NFSv3 also supports 64-bit file sizes and offsets, which enable clients to access more than 2 GB of file data.

NFSv3 relies on Remote Procedure Call (RPC) services, which are controlled by the rpcbind service. The rpcbind service responds to requests for an RPC service and then sets up connections for the requested service. In addition, separate services are used to handle locking and mounting protocols. Configuring a firewall to cope with the various ports that are used by all of these services can be complex and error-prone.

Note

In previous Oracle Linux releases, NFSv3 was able to also use the User Datagram Protocol (UDP). However, in Oracle Linux 8, NFS over UDP is no longer supported. Further, UDP is disabled in the NFS server by default in this release.

NFSv4 can work through firewalls and over the Internet. Also, NFSv4 does not require the rpcbind service. In addition, NFSv4 supports Access Control Lists (ACLs) and uses stateful operations.

NFSv4 requires the Transmission Control Protocol (TCP) running over an IP network. As mentioned, NFSv4 does not use rpcbind; as such, the NFS server listens on TCP port 2049 for service requests. The mounting and locking protocols are also integrated into the NFSv4 protocol, which means that separate services are also not required for these protocols. These refinements make firewall configuration for NFSv4 no more difficult than for a service such as HTTP.

Note that in Oracle Linux 8, NFS clients attempt to mount by using NFSv4.2 (the default version), but fall back to NFSv4.1 when the server does not support NFSv4.2. The mount later falls back to NFSv4.0 and then to NFSv3.

2.1.2 About NFS Services

In Oracle Linux 8, NFS versions rely on Remote Procedure Calls (RPC) between clients and servers. To share or mount NFS file systems, the following required services work together, depending on which version of NFS is implemented. Note that all of these services are started automatically:

nfsd Is the server kernel module that services requests for shared NFS file systems.

rpcbind Is a service that accepts port reservations from local RPC services, which are made available or advertised so that the corresponding remote RPC services can access them.

rpc.mountd Is a process that is used by an NFS server to process mount requests from NFSv3 clients. The service checks that the requested NFS share is currently exported by the NFS server and that the client is allowed to access it.

rpc.nfsd Is a process that enables the NFS versions and protocols that the server advertises to be explicitly defined.

lockd Is a kernel thread that runs on both clients and servers. The lockd process implements the Network Lock Manager (NLM) protocol, which enables NFSv3 clients to lock files on the server. The lockd process is started automatically whenever the NFS server is run and whenever an NFS file system is mounted.

rpc-statd Is a process that implements the Network Status Monitor (NSM) RPC protocol, which notifies NFS clients when an NFS server is restarted without being gracefully brought down. The rpc-statd service is automatically started by the nfs-server service. This service does not require configuration by the user and is not used with NFSv4.

rpc-idmapd Is a process that provides NFSv4 client and server upcalls, which map between on-the-wire NFSv4 names (strings in the form of user@domain) and local UIDs and GIDs. Note that for the idmapd process to function with NFSv4, you must configure the /etc/idmapd.conf file. Note that only NFSv4 uses the rpc-idmapd process.
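You can confirm which of these services are present and active on a server by querying systemd and the RPC port mapper. The following is a minimal sketch; the unit names correspond to the services described above and the output varies by system:

sudo systemctl status nfs-server rpcbind rpc-statd nfs-mountd

sudo rpcinfo -p

The rpcinfo -p command lists the RPC programs and ports that are currently registered with rpcbind, which is mainly of interest when serving NFSv3 clients.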

Note

The mounting and locking protocols are incorporated into the NFSv4 protocol. Also, the server listens on TCP port 2049. For this reason, NFSv4 does not need to interact with the rpcbind, lockd, and rpc-statd services. However, the nfs-mountd service is still required to set up exports on the NFS server, although the service is not involved in any over-the-wire operations.

The rpc-idmapd service only handles upcalls from the kernel and is not itself directly involved in any over-the-wire operations. The service, however, might make naming service calls, which do result in over-the-wire lookups.

2.2 Configuring an NFS Server

There are two ways in which you can configure an NFS server in Oracle Linux 8:

• By editing the /etc/exports file manually.

Exports can also be added to files that you create in the /etc/exports.d directory.

• By using the exportfs command.

The following procedures describe both of these methods.

2.2.1 Configuring an NFS Server by Editing the /etc/exports File

The following steps describe how to configure an NFS server by editing the /etc/exports file.

Note

You can also add exports to files that you create in the /etc/exports.d directory in a similar fashion.

1. Check that the nfs-utils package is installed on the system. If necessary, install the package as follows:

sudo dnf install nfs-utils

2. Edit the /etc/exports file to define the directories that the server will make available for clients to mount, for example:

/var/folder          192.0.2.102(rw,async)
/usr/local/apps      *(all_squash,anonuid=501,anongid=501,ro)
/var/projects/proj1  192.168.1.0/24(ro) mgmtpc(rw)

Each entry includes the local path to the exported directory, followed by a list of clients that can mount the directory, with client-specific exports options in parentheses.

The information in the previous example is as follows:

• The client system with the IP address 192.0.2.102 can mount the /var/folder directory with read and write permissions. All writes to the disk are asynchronous, which means that the server does not wait for write requests to be written to disk before responding to further requests from the client.

Note

No other clients are allowed to mount the /var/folder directory.

• All of the clients can mount the /usr/local/apps directory as read-only. All connecting users, including root users, are mapped to the local, unprivileged user with UID 501 and GID 501.

• All of the clients on the 192.168.1.0/24 subnet can mount the /var/projects/proj1 directory as read-only and the client system named mgmtpc can mount the directory with read-write permissions.

Note

There is no space between a client specifier and the parenthesized list of options that apply to that client.

For more information, see the exports(5) manual page.

3. If the server will serve NFSv4 clients, edit the /etc/idmapd.conf file's definition for the Domain parameter to specify the DNS domain name of the server:

Domain = mydom.com

This setting prevents the owner and group from being unexpectedly listed as the anonymous user or group (nobody or nogroup) on NFS clients when the all_squash mount option has not been specified.
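For reference, the Domain parameter belongs in the [General] section of the /etc/idmapd.conf file. A minimal sketch of the relevant part of the file, using the placeholder domain from this step, is as follows:

[General]
Domain = mydom.com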

4. If you need to enable access through the firewall for NFSv4 clients only, use the following commands:

sudo firewall-cmd --zone=zone --add-service=nfs

sudo firewall-cmd --permanent --zone=zone --add-service=nfs

This configuration assumes that rpc.nfsd listens for client requests on TCP port 2049, which it does by default.

5. If you need to enable access through the firewall for NFSv3 and NFSv4 clients, do the following:

a. Edit the /etc/nfs.conf file to create port settings for handling network mount requests and status monitoring, as well as set the TCP port on which the network lock manager should listen, for example:

# Ports that various services should listen on.

[mountd]
port = 892

[statd]
port = 662

[lockd]
port = 32803

If any port is in use, use the lsof -i command to locate an unused port and then amend the setting in the /etc/nfs.conf file, as appropriate.
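For example, before assigning port 32803 to lockd, you could run the following command to check whether anything is already listening on that port; no output means that the port is free:

sudo lsof -i :32803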

b. Restart the firewall service and configure the firewall to allow NFSv3 connections:

sudo firewall-cmd --permanent --zone=zone \
  --add-port=2049/tcp --add-port=111/tcp --add-port=32803/tcp --add-port=892/tcp --add-port=662/tcp

c. Shut down and then reboot the server.

sudo systemctl reboot

Note that NFS fails to start if one of the specified ports is in use and reports an error in /var/log/messages. Edit the /etc/nfs.conf file to use a different port number for the service that was unable to start, and then attempt to restart the nfs-server service. You can use the rpcinfo -p command to confirm on which ports RPC services are listening.

6. Start the nfs-server service and configure the service to start following a system reboot:

sudo systemctl enable --now nfs-server

7. Display a list of the exported file systems by running the showmount -e command, for example:

sudo showmount -e

Export list for host01.mydom.com
/var/folder          192.0.2.102
/usr/local/apps      *
/var/projects/proj1  192.168.1.0/24 mgmtpc

You can also use the exportfs command on the server to display this information, for example:

sudo /usr/sbin/exportfs -v

Use the showmount -a command to display all of the current clients, as well as all of the file systems that they have mounted:

sudo showmount -a

Note

To enable use of the showmount command from NFSv4 clients, MOUNTD_PORT must be defined in the /etc/nfs.conf file and a firewall rule must enable access on this TCP port.

2.2.2 Configuring an NFS Server by Using the exportfs Command

You can also export or unexport directories by using the exportfs command. Using the exportfs command enables the root user to export or unexport directories selectively, and eliminates the need to restart the NFS service. By providing the appropriate options, the exportfs command writes the exported file systems to the /var/lib/nfs/etab file. Changes to the list of exported file systems are effective immediately because the nfs-mountd service refers to the etab file for determining access privileges to a file system.

The following are some of the options that you can specify with the exportfs command. Note that using the exportfs command without any options displays a list of currently exported file systems:

-r Enables all of the directories that are listed in the /etc/exports file to be exported by constructing a new export list in the /var/lib/nfs/etab file. The -r option refreshes the export list with changes that are made to the /etc/exports file.

-a Enables all directories to be exported or unexported, which is dependent on other options that are passed to the exportfs command. If no other options are specified, the command exports all of the file systems that are specified in the /etc/exports file.

-u Unexports all of the shared directories.

Note

The exportfs -ua command suspends NFS file sharing, but keeps all NFS services running. To re-enable NFS sharing, use the exportfs -r command.

-v Specifies a verbose operation of the exportfs command, which displays information about the file systems that are being exported or unexported in greater detail.
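For example, a typical maintenance sequence that combines these options might look as follows. This is a sketch only; it assumes that /etc/exports already lists the shares that you want to publish:

sudo exportfs -rv     # refresh the export list from /etc/exports, verbosely
sudo exportfs -v      # display the currently exported file systems
sudo exportfs -ua     # temporarily suspend all NFS shares
sudo exportfs -r      # re-enable sharing from /etc/exports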

For more information, see the exportfs(8), exports(5), and showmount(8) manual pages.

2.3 Mounting an NFS File System

To mount an NFS file system on an Oracle Linux 8 client:

1. Check whether the nfs-utils package is installed on the system. If necessary, install the package as follows:

sudo dnf install nfs-utils

2. Use the showmount -e command to display the file systems that the NFS server exports, for example:

sudo showmount -e host01.mydom.com

The output of the previous command would be similar to the following:

Export list for host01.mydom.com
/var/folder          192.0.2.102
/usr/local/apps      *
/var/projects/proj1  192.168.1.0/24 mgmtpc

Note

Be aware that some servers do not allow querying of this information, but the server may still be exporting NFS file systems.

3. Use the mount command to mount an exported NFS file system on an available mount point:

sudo mount -r -o nosuid host01.mydom.com:/usr/local/apps /apps

Note that in most cases, when mounting an NFS file system, the -t nfs option can be omitted.

This example mounts the /usr/local/apps directory that is exported by host01.mydom.com with read-only permissions on /apps. The nosuid option prevents remote users from gaining higher privileges by running a setuid program.

4. To configure the system to mount an NFS file system at boot time, add an entry for the file system to the /etc/fstab file, as shown in the following example:

host01.mydom.com:/usr/local/apps /apps nfs ro,nosuid 0 0
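After adding the entry, you can verify it without rebooting. For example, the following sketch mounts any NFS entries in the /etc/fstab file that are not yet mounted and then displays the resulting mount:

sudo mount -a -t nfs

findmnt /apps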

For more information, see the mount(8), nfs(5), and showmount(8) manual pages.

Chapter 3 Managing the Oracle Cluster File System Version 2 in Oracle Linux

Table of Contents

3.1 About OCFS2
3.2 Maximum File System Size Requirements for OCFS2
3.3 Installing and Configuring an OCFS2 Cluster
    3.3.1 Preparing a Cluster for OCFS2
    3.3.2 Configuring the Firewall for the Cluster
    3.3.3 Configuring the Cluster Software
    3.3.4 Creating the Configuration File for the Cluster Stack
    3.3.5 Configuring the Cluster Stack
    3.3.6 Configuring the Kernel for Cluster Operation
    3.3.7 Commands for Administering the Cluster Stack
3.4 Administering OCFS2 Volumes
    3.4.1 Commands for Creating OCFS2 Volumes
    3.4.2 Suggested Cluster Size Settings
    3.4.3 Creating OCFS2 Volumes
    3.4.4 Mounting OCFS2 Volumes
    3.4.5 Querying and Changing Volume Parameters
3.5 Creating a Local OCFS2 File System
3.6 Troubleshooting OCFS2 Issues
    3.6.1 Recommended Debugging Tools and Practices
    3.6.2 Mounting the debugfs File System
    3.6.3 Configuring OCFS2 Tracing
    3.6.4 Debugging File System Locks
    3.6.5 Configuring the Behavior of Fenced Nodes
3.7 OCFS2 Use Cases
    3.7.1 Load Balancing Use Case
    3.7.2 Oracle Real Application Cluster Use Case
    3.7.3 Oracle Database Use Case

This chapter includes information about managing the Oracle Cluster File System Version 2 (OCFS2) in Oracle Linux 8. The chapter includes tasks for configuring, administering, and troubleshooting OCFS2.

Note

In Oracle Linux 8, the OCFS2 file system type is supported on Unbreakable Enterprise Kernel (UEK) releases only, starting with Unbreakable Enterprise Kernel Release 6 (UEK R6).

For information about local file system management in Oracle Linux, see Oracle® Linux 8: Managing Local File Systems.

3.1 About OCFS2

OCFS2 (Oracle Cluster File System Version 2) is a general-purpose shared-disk file system that is intended for use with clusters. OCFS2 offers high performance, as well as high availability. It is also possible to mount an OCFS2 volume on a standalone, non-clustered system.

Although it might appear that the ability to mount an OCFS2 file system locally has no benefits when compared to alternative file systems such as ext4 or btrfs, you can use the reflink command with OCFS2 to create copy-on-write clones of individual files. You can also use the cp --reflink command in a similar way that you would on a Btrfs file system. Typically, such clones enable you to save disk space when storing multiple copies of very similar files, such as virtual machine (VM) images or Linux Containers. In addition, mounting a local OCFS2 file system enables you to subsequently migrate it to a cluster file system without requiring any conversion. Note that when using the reflink command, the resulting file system behaves like a clone of the original file system, which means that their UUIDs are identical. When using the reflink command to create a clone, you must change the UUID by using the tunefs.ocfs2 command. See Section 3.4.5, “Querying and Changing Volume Parameters”.
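For example, the following sketch clones a virtual machine image on a mounted OCFS2 volume. The paths are hypothetical, and the reflink utility is provided by the ocfs2-tools package:

reflink /ocfs2/images/ol8-template.img /ocfs2/images/ol8-clone.img

cp --reflink=always /ocfs2/images/ol8-template.img /ocfs2/images/ol8-clone2.img

Both commands create copy-on-write clones, so the clones initially share data blocks with the original image and consume additional space only as the files diverge.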

Almost all applications can use OCFS2, as it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster, or they can use the available file-system functionality to fail over and run on another node in the event that a node fails.

The following are examples of some typical use cases for OCFS2:

• Oracle VM to host shared access to virtual machine images.

• Oracle VM and VirtualBox to enable Linux guest machines to share a file system.

• Oracle Real Application Cluster (RAC) in database clusters.

• Oracle E-Business Suite in middleware clusters.

The following OCFS2 features make it a suitable choice for deployment in an enterprise-level computing environment:

• Support for ordered and write-back data journaling that provides file system consistency in the event of power failure or system crash.

• Block sizes ranging from 512 bytes to 4 KB, and file-system cluster sizes ranging from 4 KB to 1 MB (both in increments of powers of 2). The maximum supported volume size is 16 TB, which corresponds to a cluster size of 4 KB. A volume size as large as 4 PB is theoretically possible for a cluster size of 1 MB, although this limit has not been tested.

• Inclusion of nowait support for OCFS2 in Unbreakable Enterprise Kernel Release 6 (UEK R6).

When the nowait flag is specified, -EAGAIN is returned if the following checks fail for direct I/O: cannot get related locks immediately or blocks are not allocated at the write location, which triggers block allocation and subsequently blocks I/O operations.

• Extent-based allocations for efficient storage of very large files.

• Optimized allocation support for sparse files, inline-data, unwritten extents, hole punching, reflinks, and allocation reservation for high performance and efficient storage.

• Indexing of directories to allow efficient access to a directory even if it contains millions of objects.

• Metadata checksums for the detection of corrupted inodes and directories.

• Extended attributes to allow an unlimited number of name:value pairs to be attached to file system objects such as regular files, directories, and symbolic links.

• Advanced security support for POSIX ACLs and SELinux in addition to the traditional file-access permission model.

• Support for user and group quotas.

• Support for heterogeneous clusters of nodes with a mixture of 32-bit and 64-bit, little-endian (x86, x86_64, ia64), and big-endian (ppc64) architectures.

• An easy-to-configure, in-kernel cluster-stack (O2CB) with the Linux Distributed Lock Manager (DLM) for managing concurrent access from the cluster nodes.

• Support for buffered, direct, asynchronous, splice, and memory-mapped I/O.

• A toolset that uses similar parameters as the ext3 file system.

For more information about OCFS2, visit https://oss.oracle.com/projects/ocfs2/documentation/.

3.2 Maximum File System Size Requirements for OCFS2

Starting with the Oracle Linux 8.2 release, the OCFS2 file system is supported on systems that are running the Unbreakable Enterprise Kernel Release 6 (UEK R6) kernel.

The maximum file size and maximum file system size requirements for OCFS2 are as follows:

• Maximum file size: 4 PiB

• Maximum file system size: 4 PiB

3.3 Installing and Configuring an OCFS2 Cluster

The following procedures describe how to set up a cluster to use OCFS2. 3.3.1 Preparing a Cluster for OCFS2

For the best performance, each node in the cluster should have at least two network interfaces. The first interface is connected to a public network to allow general access to the systems, while the second interface is used for private communications between the nodes and the cluster heartbeat. This second interface determines how the cluster nodes coordinate their access to shared resources and how they monitor each other's state.

Note

Both network interfaces must be connected through a network switch. Additionally, you must ensure that all of the network interfaces are configured and working before configuring the cluster.

You can choose from the following two cluster heartbeat configurations:

• Local heartbeat thread for each shared device (default heartbeat mode).

In this mode, a node starts a heartbeat thread when it mounts an OCFS2 volume and stops the thread when it unmounts the volume. There is a large CPU overhead on nodes that mount a large number of OCFS2 volumes as each mount requires a separate heartbeat thread. Note that a large number of mounts also increases the risk of a node fencing itself out of the cluster due to a heartbeat I/O timeout on a single mount.

• Global heartbeat on specific shared devices.

This mode enables you to configure any OCFS2 volume as a global heartbeat device, provided that it occupies a whole disk device and not a partition. In this mode, the heartbeat to the device starts when the cluster comes online and stops when the cluster goes offline. This mode is recommended for clusters that mount a large number of OCFS2 volumes. A node fences itself out of the cluster if a heartbeat I/O timeout occurs on more than half of the global heartbeat devices. To provide redundancy against failure of one of the devices, you should configure at least three global heartbeat devices.

The following figure shows a cluster of four nodes that are connected by using a network switch to a LAN and a network storage server. The nodes and storage server are also connected by using a switch to a private network that is used for the local cluster heartbeat.

Figure 3.1 Cluster Configuration by Using a Private Network

Although it is possible to configure and use OCFS2 without using a private network, note that such a configuration increases the probability of a node fencing itself out of the cluster due to an I/O heartbeat timeout.

3.3.2 Configuring the Firewall for the Cluster

Configure or disable the firewall on each node to allow access on the interface that the cluster will use for private cluster communication. By default, the cluster uses both TCP and UDP over port 7777.

For example, to allow incoming TCP connections and UDP datagrams on port 7777, you would use the following command:

sudo firewall-cmd --zone=zone --add-port=7777/tcp --add-port=7777/udp

sudo firewall-cmd --permanent --zone=zone --add-port=7777/tcp --add-port=7777/udp

3.3.3 Configuring the Cluster Software

Ideally, each node should be running the same version of OCFS2 software and a compatible UEK release. It is possible for a cluster to run with mixed versions of the OCFS2 and UEK software; for example, while you are performing a rolling update of a cluster. The cluster node that is running the lowest version of the software determines the set of usable features.

Use the dnf command to install or upgrade the following packages to the same version on each node:

• kernel-uek

• ocfs2-tools
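For example, a minimal sketch of installing both packages on one node is as follows; it assumes that the UEK repositories are already enabled on the system:

sudo dnf install -y kernel-uek ocfs2-tools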

Note

If you want to use the global heartbeat feature, you need to install the ocfs2-tools-1.8.0-11 or later package.

3.3.4 Creating the Configuration File for the Cluster Stack

You create the configuration file by using the o2cb command or by using a text editor.

To configure the cluster stack by using the o2cb command:

1. Create a cluster definition by using the following command:

sudo o2cb add-cluster cluster_name

For example, you would define a cluster named mycluster with four nodes as follows:

sudo o2cb add-cluster mycluster

The previous command creates the /etc/ocfs2/cluster.conf configuration file, if it does not already exist.

2. For each node, define the node as follows:

sudo o2cb add-node cluster_name node_name --ip ip_address

The name of the node must be the same as the value of the system's HOSTNAME that is configured in the /etc/sysconfig/network file. The IP address will be used by the node for private communication in the cluster.

For example, you would use the following command to define a node named node0, with the IP address 10.1.0.100, in the cluster mycluster:

sudo o2cb add-node mycluster node0 --ip 10.1.0.100

Note

OCFS2 only supports IPv4 addresses.

3. If you want the cluster to use global heartbeat devices, run the following commands:

sudo o2cb add-heartbeat cluster_name device1
...

sudo o2cb heartbeat-mode cluster_name global

Important

You must configure the global heartbeat feature to use whole disk devices. You cannot configure a global heartbeat device on a disk partition.

For example, you would use /dev/sdd, /dev/sdg, and /dev/sdj as global heartbeat devices by using the following commands:

sudo o2cb add-heartbeat mycluster /dev/sdd

sudo o2cb add-heartbeat mycluster /dev/sdg

sudo o2cb add-heartbeat mycluster /dev/sdj

sudo o2cb heartbeat-mode mycluster global

4. Copy the cluster configuration file, /etc/ocfs2/cluster.conf, to each node in the cluster.

5. Restart the cluster stack for the changes you made to the cluster configuration file to take effect.
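For example, the following sketch copies the file to one other node and then restarts that node's cluster stack by using the commands that are described in Section 3.3.7, “Commands for Administering the Cluster Stack”; node1 is a placeholder host name:

scp /etc/ocfs2/cluster.conf root@node1:/etc/ocfs2/cluster.conf

ssh root@node1 "/sbin/o2cb.init offline && /sbin/o2cb.init online"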

The following is a typical example of the /etc/ocfs2/cluster.conf file. This particular configuration defines a 4-node cluster named mycluster, with a local heartbeat:

node:
    name = node0
    cluster = mycluster
    number = 0
    ip_address = 10.1.0.100
    ip_port = 7777

node:
    name = node1
    cluster = mycluster
    number = 1
    ip_address = 10.1.0.101
    ip_port = 7777

node:
    name = node2
    cluster = mycluster
    number = 2
    ip_address = 10.1.0.102
    ip_port = 7777

node:
    name = node3
    cluster = mycluster
    number = 3
    ip_address = 10.1.0.103
    ip_port = 7777

cluster:
    name = mycluster
    heartbeat_mode = local
    node_count = 4

If you configure your cluster to use a global heartbeat, the file also includes entries for the global heartbeat devices, as shown in the following example:

node:
    name = node0
    cluster = mycluster
    number = 0
    ip_address = 10.1.0.100
    ip_port = 7777

node:
    name = node1
    cluster = mycluster
    number = 1
    ip_address = 10.1.0.101
    ip_port = 7777

node:
    name = node2
    cluster = mycluster
    number = 2
    ip_address = 10.1.0.102
    ip_port = 7777

node:
    name = node3
    cluster = mycluster
    number = 3
    ip_address = 10.1.0.103
    ip_port = 7777

cluster:
    name = mycluster
    heartbeat_mode = global
    node_count = 4

heartbeat:
    cluster = mycluster
    region = 7DA5015346C245E6A41AA85E2E7EA3CF

heartbeat:
    cluster = mycluster
    region = 4F9FBB0D9B6341729F21A8891B9A05BD

heartbeat:
    cluster = mycluster
    region = B423C7EEE9FC426790FC411972C91CC3

The cluster heartbeat mode is now shown as global and the heartbeat regions are represented by the UUIDs of their block devices.

If you edit the configuration file manually, ensure that you use the following layout:

• The cluster:, heartbeat:, and node: headings must start in the first column.

• Each parameter entry must be indented by one tab space.

• A blank line must separate each section that defines the cluster, a heartbeat device, or a node.

3.3.5 Configuring the Cluster Stack

When configuring a cluster stack, there are several values for which you are prompted. Refer to the following table for the values that you need to provide.

Load O2CB driver on boot (y/n)
    Specify whether the cluster stack driver should be loaded at boot time. The default response is n.

Cluster stack backing O2CB
    Name of the cluster stack service. The default and usual response is o2cb.

Cluster to start at boot (Enter "none" to clear)
    Enter the name of your cluster that you defined in the cluster configuration file, /etc/ocfs2/cluster.conf.

Specify heartbeat dead threshold (>=7)
    Number of 2-second heartbeats that must elapse without response before a node is considered dead. To calculate the value to enter, divide the required threshold time period by 2 and then add 1. For example, to set the threshold time period to 120 seconds, enter a value of 61. The default value is 31, which corresponds to a threshold time period of 60 seconds. Note: If your system uses multipathed storage, the recommended value is 61 or greater.

Specify network idle timeout in ms (>=5000)
    Time in milliseconds that must elapse before a network connection is considered dead. The default value is 30,000 milliseconds. Note: For bonded network interfaces, the recommended value is 30,000 milliseconds or greater.

Specify network keepalive delay in ms (>=1000)
    Maximum delay in milliseconds between sending keepalive packets to another node. The default and recommended value is 2,000 milliseconds.

Specify network reconnect delay in ms (>=2000)
    Minimum delay in milliseconds between reconnection attempts if a network connection goes down. The default and recommended value is 2,000 milliseconds.

Follow these steps to configure the cluster stack:

1. On each node of the cluster, run the following command:

sudo /sbin/o2cb.init configure

Verify the settings for the cluster stack.

sudo /sbin/o2cb.init status

Driver for "": Loaded Filesystem "configfs": Mounted Stack glue driver: Loaded Stack plugin "o2cb": Loaded Driver for "ocfs2_dlmfs": Loaded Filesystem "ocfs2_dlmfs": Mounted Checking O2CB cluster "mycluster": Online Heartbeat dead threshold: 61 Network idle timeout: 30000 Network keepalive delay: 2000 Network reconnect delay: 2000 Heartbeat mode: Local Checking O2CB heartbeat: Active

In the previous example output, the cluster is online and is using the local heartbeat mode. If no volumes have been configured, the O2CB heartbeat is shown as Not active, rather than Active.

The following is the command output for an online cluster that is using three global heartbeat devices:

sudo /sbin/o2cb.init status

Driver for "configfs": Loaded Filesystem "configfs": Mounted Stack glue driver: Loaded Stack plugin "o2cb": Loaded Driver for "ocfs2_dlmfs": Loaded Filesystem "ocfs2_dlmfs": Mounted

18 Configuring the Kernel for Cluster Operation

Checking O2CB cluster "mycluster": Online Heartbeat dead threshold: 61 Network idle timeout: 30000 Network keepalive delay: 2000 Network reconnect delay: 2000 Heartbeat mode: Global Checking O2CB heartbeat: Active 7DA5015346C245E6A41AA85E2E7EA3CF /dev/sdd 4F9FBB0D9B6341729F21A8891B9A05BD /dev/sdg B423C7EEE9FC426790FC411972C91CC3 /dev/sdj

2. Configure the o2cb and ocfs2 services so that they start at boot time after networking is enabled.

sudo systemctl enable o2cb

sudo systemctl enable ocfs2

These settings enable the node to mount OCFS2 volumes automatically when the system starts.

3.3.6 Configuring the Kernel for Cluster Operation

To ensure the correct operation of a cluster, you must configure the required kernel settings, as described in the following table.

Kernel Setting    Description
panic             Specifies the number of seconds after a panic before a system automatically resets itself.
                  If the value is 0, the system hangs, which allows you to collect detailed information about the panic for troubleshooting. This is the default value.
                  To enable automatic reset, set a non-zero value. If you require a memory image (vmcore), allow enough time for Kdump to create this image. The suggested value is 30 seconds, although large systems will require a longer time.
panic_on_oops     Specifies that a system must panic if a kernel oops occurs. If a kernel thread required for cluster operation crashes, the system must reset itself. Otherwise, another node might not be able to tell whether a node is slow to respond or unable to respond, causing cluster operations to hang.

1. On each node, set the recommended values for panic and panic_on_oops, for example:

sudo sysctl kernel.panic=30

sudo sysctl kernel.panic_on_oops=1
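You can confirm the running values before making them persistent; for example:

sudo sysctl kernel.panic kernel.panic_on_oops

kernel.panic = 30
kernel.panic_on_oops = 1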

2. To make the change persist across reboots, add the following entries to the /etc/sysctl.conf file:

# Define panic and panic_on_oops for cluster operation
kernel.panic=30
kernel.panic_on_oops=1

3.3.7 Commands for Administering the Cluster Stack

There are several commands that you can use to administer the cluster stack. The following table describes the commands for performing various operations on the cluster stack.

Command                    Description
/sbin/o2cb.init status     Check the status of the cluster stack.
/sbin/o2cb.init online     Start the cluster stack.
/sbin/o2cb.init offline    Stop the cluster stack.
/sbin/o2cb.init unload     Unload the cluster stack.

3.4 Administering OCFS2 Volumes

The following tasks describe how to administer OCFS2 volumes.

3.4.1 Commands for Creating OCFS2 Volumes

You use the mkfs.ocfs2 command to create an OCFS2 volume on a device. If you want to label the volume and mount it by specifying the label, the device must correspond to a partition. You cannot mount an unpartitioned disk device by specifying a label.

The following table describes some useful options that you can use when creating an OCFS2 volume.

Command Option Description -b block-size Specifies the unit size for I/O transactions to and from the file system, and the size of inode and extent blocks. The supported block sizes are --block-size block-size 512 (512 bytes), 1K, 2K, and 4K. The default and recommended block size is 4K (4 kilobytes). -C cluster-size Specifies the unit size for space used to allocate file data. The supported cluster sizes are 4K, 8K, 16K, 32K, 64K, 128K, 256K, 512K, --cluster-size cluster- and 1M (1 megabyte). The default cluster size is 4K (4 kilobytes). size --fs-feature- Enables you select a set of file-system features: level=feature-level default Enables support for the sparse files, unwritten extents, and inline data features.

max-compat Enables only those features that are understood by older versions of OCFS2.

max-features Enables all features that OCFS2 currently supports. --fs_features=feature Enables you to enable or disable individual features such as support for sparse files, unwritten extents, and backup superblocks. For more information, see the mkfs.ocfs2(8) manual page. -J size=journal-size Specifies the size of the write-ahead journal. If not specified, the size is determined from the file system usage type that you specify to the - --journal-options T option, and, otherwise, from the volume size. The default size of the size=journal-size journal is 64M (64 MB) for datafiles, 256M (256 MB) for mail, and 128M (128 MB) for vmstore. -L volume-label Specifies a descriptive name for the volume that allows you to identify it easily on different cluster nodes. --label volume-label -N number Determines the maximum number of nodes that can concurrently access a volume, which is limited by the number of node slots for

20 Suggested Cluster Size Settings

Command Option Description --node-slots number system files such as the file-system journal. For best performance, set the number of node slots to at least twice the number of nodes. If you subsequently increase the number of node slots, performance can suffer because the journal will no longer be contiguously laid out on the outer edge of the disk platter. -T file-system-usage-type Specifies the type of usage for the file system:

datafiles Database files are typically few in number, fully allocated, and relatively large. Such files require few metadata changes, and do not benefit from having a large journal.

mail Mail server files are typically many in number, and relatively small. Such files require many metadata changes, and benefit from having a large journal.

vmstore Virtual machine image files are typically few in number, sparsely allocated, and relatively large. Such files require a moderate number of metadata changes and a medium sized journal. 3.4.2 Suggested Cluster Size Settings

The following table provides the suggested minimum cluster size settings for different file system size ranges.

File System Size    Suggested Minimum Cluster Size
1 GB - 10 GB        8K
10 GB - 100 GB      16K
100 GB - 1 TB       32K
1 TB - 10 TB        64K
10 TB - 16 TB       128K

3.4.3 Creating OCFS2 Volumes

When creating OCFS2 volumes, keep the following additional points in mind:

• Do not create an OCFS2 volume on an LVM logical volume, as LVM is not cluster-aware.

• You cannot change the block and cluster size of an OCFS2 volume after you have created it. You can use the tunefs.ocfs2 command to modify other settings for the file system, with certain restrictions. For more information, see the tunefs.ocfs2(8) manual page.

• If you intend that the volume store database files, do not specify a cluster size that is smaller than the block size of the database.

• The default cluster size of 4 KB is not suitable if the file system is larger than a few gigabytes.

The following examples show some of the ways in which you can create an OCFS2 volume.

Create an OCFS2 volume on /dev/sdc1 labeled myvol using all of the default settings for generic usage on file systems that are no larger than a few gigabytes. The default values are a 4 KB block and cluster size, eight node slots, a 256 MB journal, and support for default file-system features:

sudo mkfs.ocfs2 -L "myvol" /dev/sdc1

Create an OCFS2 volume on /dev/sdd2 labeled as dbvol for use with database files. In this case, the cluster size is set to 128 KB and the journal size to 32 MB.

sudo mkfs.ocfs2 -L "dbvol" -T datafiles /dev/sdd2

Create an OCFS2 volume on /dev/sde1, with a 16 KB cluster size, a 128 MB journal, 16 node slots, and support enabled for all features except refcount trees.

sudo mkfs.ocfs2 -C 16K -J size=128M -N 16 --fs-feature-level=max-features \
  --fs-features=norefcount /dev/sde1

3.4.4 Mounting OCFS2 Volumes

Specify the _netdev option in the /etc/fstab file if you want the system to mount an OCFS2 volume at boot time after networking is started and unmount the file system before networking is stopped, as shown in the following example:

myocfs2vol /dbvol1 ocfs2 _netdev,defaults 0 0
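You can also mount an OCFS2 volume manually, provided that the cluster stack is online on the node. The following is a sketch only; the device and mount point names are illustrative:

sudo mount -t ocfs2 /dev/sdc1 /dbvol1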

Note

For the file system to mount, you must enable the o2cb and ocfs2 services to start after networking is started. See Section 3.3.5, “Configuring the Cluster Stack”.

3.4.5 Querying and Changing Volume Parameters

Use the tunefs.ocfs2 command to query or change volume parameters.

For example, to find out the label, UUID, and number of node slots for a volume, you would use the following command:

sudo tunefs.ocfs2 -Q "Label = %V\nUUID = %U\nNumSlots =%N\n" /dev/sdb

Label = myvol
UUID = CBB8D5E0C169497C8B52A0FD555C7A3E
NumSlots = 4

You would generate a new UUID for a volume by using the following commands:

sudo tunefs.ocfs2 -U /dev/sdb

sudo tunefs.ocfs2 -Q "Label = %V\nUUID = %U\nNumSlots =%N\n" /dev/sdb

Label = myvol
UUID = 48E56A2BBAB34A9EB1BE832B3C36AB5C
NumSlots = 4

3.5 Creating a Local OCFS2 File System

Note

The OCFS2 file system type is supported on the Unbreakable Enterprise Kernel (UEK) release only.

The following procedure describes how to create an OCFS2 file system to be mounted locally, which is not associated with a cluster.

To create an OCFS2 file system that is to be mounted locally, use the following command syntax:

sudo mkfs.ocfs2 -M local --fs-features=local -N 1 [options] device

For example, you would create a locally mountable OCFS2 volume on /dev/sdc1, with one node slot and the label localvol, as follows:

sudo mkfs.ocfs2 -M local --fs-features=local -N 1 -L "localvol" /dev/sdc1

You can use the tunefs.ocfs2 utility to convert a local OCFS2 file system to cluster use, as follows:

sudo umount /dev/sdc1

sudo tunefs.ocfs2 -M cluster --fs-features=clusterinfo -N 8 /dev/sdc1

The previous example also increases the number of node slots from 1 to 8, to allow up to eight nodes to mount the file system.

3.6 Troubleshooting OCFS2 Issues

Refer to the following information when investigating how to resolve issues that you might encounter when administering OCFS2.

3.6.1 Recommended Debugging Tools and Practices

You can use the following tools to troubleshoot OCFS2 issues:

• It is recommended that you set up netconsole on the nodes to capture an oops trace.

• You can use the tcpdump command to capture the DLM's network traffic between nodes. For example, to capture TCP traffic on port 7777 for the private network interface em2, you could use the following command:

sudo tcpdump -i em2 -C 10 -W 15 -s 10000 -Sw /tmp/`hostname -s`_tcpdump.log \
  -ttt 'port 7777' &

• You can use the debugfs.ocfs2 command to trace events in the OCFS2 driver, determine lock statuses, walk directory structures, examine inodes, and so on. This command is similar in behavior to the debugfs command that is used for the ext3 file system.

For more information, see the debugfs.ocfs2(8) manual page.

• Use the o2image command to save an OCFS2 file system's metadata, including information about inodes, file names, and directory names, to an image file on another file system. Because the image file contains only metadata, it is much smaller than the original file system. You can use the debugfs.ocfs2 command to open the image file and analyze the file system layout to determine the cause of a file system corruption or performance problem.

For example, to create the image /tmp/sda2.img from the OCFS2 file system on the device /dev/sda2, you would use the following command:

sudo o2image /dev/sda2 /tmp/sda2.img

For more information, see the o2image(8) manual page.
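For example, you can open the image file that was created by the previous command and inspect it interactively. This is a sketch, assuming the image path shown above; the -i option indicates to debugfs.ocfs2 that the argument is an o2image file rather than a device:

sudo debugfs.ocfs2 -i /tmp/sda2.img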

3.6.2 Mounting the debugfs File System

OCFS2 uses the debugfs file system to enable userspace access to information about its in-kernel state. Note that you must mount the debugfs file system to use the debugfs.ocfs2 command.

For example, to mount the debugfs file system, add the following line to the /etc/fstab file:

debugfs /sys/kernel/debug debugfs defaults 0 0

Then, run the mount -a command.

3.6.3 Configuring OCFS2 Tracing

You can use the following commands and methods to trace issues in OCFS2.

3.6.3.1 Commands for Tracing OCFS2 Issues

The following table describes several commands that are useful for tracing issues.

Command                                                Description
debugfs.ocfs2 -l                                       List all of the trace bits and their statuses.
debugfs.ocfs2 -l SUPER allow                           Enable tracing for the superblock.
debugfs.ocfs2 -l SUPER off                             Disable tracing for the superblock.
debugfs.ocfs2 -l SUPER deny                            Disallow tracing for the superblock, even if it is implicitly enabled by another tracing mode setting.
debugfs.ocfs2 -l HEARTBEAT ENTRY EXIT allow            Enable heartbeat tracing.
debugfs.ocfs2 -l HEARTBEAT off ENTRY EXIT deny         Disable heartbeat tracing. Note that the ENTRY and EXIT parameters are set to deny, as they exist in all trace paths.
debugfs.ocfs2 -l ENTRY EXIT NAMEI INODE allow          Enable tracing for the file system.
debugfs.ocfs2 -l ENTRY EXIT deny NAMEI INODE allow     Disable tracing for the file system.
debugfs.ocfs2 -l ENTRY EXIT DLM DLM_THREAD allow       Enable tracing for the DLM.
debugfs.ocfs2 -l ENTRY EXIT deny DLM DLM_THREAD allow  Disable tracing for the DLM.

3.6.3.2 OCFS2 Tracing Methods and Examples

One method that you can use to obtain a trace is to first enable the trace, sleep for a short while, and then disable the trace. As shown in the following example, to avoid unnecessary output, you should reset the trace bits to their default settings after you have finished tracing:

sudo debugfs.ocfs2 -l ENTRY EXIT NAMEI INODE allow && sleep 10 && \
  debugfs.ocfs2 -l ENTRY EXIT deny NAMEI INODE off

To limit the amount of information that is displayed, enable only the trace bits that are relevant to diagnosing the problem.

If a specific file system command, such as mv, is causing an error, you might use commands that are shown in the following example to trace the error:

sudo debugfs.ocfs2 -l ENTRY EXIT NAMEI INODE allow

mv source destination & CMD_PID=$(jobs -p %-)

echo $CMD_PID

sudo debugfs.ocfs2 -l ENTRY EXIT deny NAMEI INODE off

Because the trace is enabled for all mounted OCFS2 volumes, knowing the correct process ID can help you to interpret the trace.
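If you need to isolate the messages for that process, note that the trace output is written to the kernel log and, on most kernels, each OCFS2 trace line includes the name and PID of the process that generated it. A rough sketch of filtering by the PID captured in the previous example follows; the exact message format can vary between kernel versions:

sudo dmesg | grep -w "$CMD_PID"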

For more information, see the debugfs.ocfs2(8) manual page.

3.6.4 Debugging File System Locks

If an OCFS2 volume hangs, you can use the following procedure to determine which locks are busy and which processes are likely to be holding the locks.

In the following procedure, the Lockres value refers to the lock name that is used by DLM, which is a combination of a lock-type identifier, inode number, and a generation number. The following table lists the various lock types and their associated identifier.

Table 3.1 DLM Lock Types

Identifier    Lock Type

D             File data
M             Metadata
R             Rename
S             Superblock
W             Read-write

1. Mount the debug file system.

sudo mount -t debugfs debugfs /sys/kernel/debug

2. Dump the lock statuses for the file system device, which is /dev/sdx1 in the following example:

echo "fs_locks" | debugfs.ocfs2 /dev/sdx1 >/tmp/fslocks 62 Lockres: M00000000000006672078b84822 Mode: Protected Read Flags: Initialized Attached RO Holders: 0 EX Holders: 0 Pending Action: None Pending Unlock Action: None Requested Mode: Protected Read Blocking Mode: Invalid

3. Use the Lockres value from the output in the previous step to obtain the inode number and generation number for the lock.

echo "stat " | debugfs.ocfs2 -n /dev/sdx1

Inode: 419616 Mode: 0666 Generation: 2025343010 (0x78b84822) ...


4. Determine the file system object to which the inode number relates; for example, running the following command displays the output that follows:

sudo echo "locate <419616>" | debugfs.ocfs2 -n /dev/sdx1

419616 /linux-2.6.15/arch/i386/kernel/semaphore.c

5. Obtain the lock names that are associated with the file system object.

echo "encode /linux-2.6.15/arch/i386/kernel/semaphore.c" | debugfs.ocfs2 -n /dev/sdx1

M00000000000006672078b84822 D00000000000006672078b84822 W00000000000006672078b84822

In the previous example, a metadata lock, a file data lock, and a read-write lock are associated with the file system object.

6. Determine the DLM domain of the file system by running the following command:

echo "stats" | debugfs.ocfs2 -n /dev/sdX1 | grep UUID: | while read a b ; do echo $b ; done

82DA8137A49A47E4B187F74E09FBBB4B

7. Using the values of the DLM domain and the lock name, run the following command, which enables debugging for the DLM:

echo R 82DA8137A49A47E4B187F74E09FBBB4B M00000000000006672078b84822 > /proc/fs/ocfs2_dlm/debug

8. Examine the debug messages by using the dmesg | tail command, for example:

struct dlm_ctxt: 82DA8137A49A47E4B187F74E09FBBB4B, node=3, key=965960985
lockres: M00000000000006672078b84822, owner=1, state=0
last used: 0, on purge list: no
granted queue:
    type=3, conv=-1, node=3, cookie=11673330234144325711, ast=(empty=y,pend=n), bast=(empty=y,pend=n)
converting queue:
blocked queue:

The DLM supports three lock modes: no lock (type=0), protected read (type=3), and exclusive (type=5). In the previous example, the lock is mastered by node 1 (owner=1) and node 3 has been granted a protected-read lock on the file-system resource.

9. Use the following command to search for processes that are in an uninterruptible sleep state, as indicated by the D flag in the STAT column:

ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN

Note that at least one of the processes that are in the uninterruptible sleep state is responsible for the hang on the other node.

If a process is waiting for I/O to complete, the problem could be anywhere in the I/O subsystem, from the block device layer through the drivers to the disk array. If the hang concerns a user lock (flock()), the problem could lie with the application. If possible, kill the holder of the lock. If the hang is caused by a lack of memory or by fragmented memory, you can free up memory by killing non-essential processes. The most immediate solution is to reset the node that is holding the lock. The DLM recovery process can then clear all of the locks that the dead node owned, enabling the cluster to continue to operate.

3.6.5 Configuring the Behavior of Fenced Nodes

If a node with a mounted OCFS2 volume assumes that it is no longer in contact with the other cluster nodes, it removes itself from the cluster. This process is called fencing. Fencing prevents other nodes from


hanging when attempting to access resources that are held by the fenced node. By default, a fenced node restarts instead of panicking so that it can quickly rejoin the cluster. Under some circumstances, you might want a fenced node to panic instead of restarting, for example, so that you can use netconsole to view the oops stack trace or to diagnose the cause of frequent reboots.

To configure a node to panic when it next fences, run the following command on the node after the cluster starts:

echo panic > /sys/kernel/config/cluster/cluster_name/fence_method

In the previous command, cluster_name is the name of the cluster.
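For example, assuming a cluster named mycluster, you can read the same configfs attribute to confirm the current setting:

cat /sys/kernel/config/cluster/mycluster/fence_method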

To set the value after each system reboot, add this line to the /etc/rc.local file. To restore the default behavior, use the reset value instead of the panic value.

3.7 OCFS2 Use Cases

The following are some typical use cases for OCFS2.

3.7.1 Load Balancing Use Case

You can use OCFS2 nodes to share resources between client systems. For example, the nodes could export a shared file system by using Samba or NFS. To distribute service requests between the nodes, you can use round-robin DNS or a network load balancer, or you can specify which node should be used on each client.

3.7.2 Oracle Real Application Cluster Use Case

Oracle Real Application Cluster (RAC) uses its own cluster stack, Cluster Synchronization Services (CSS). You can use O2CB in conjunction with CSS, but note that each stack is configured independently for timeouts, nodes, and other cluster settings. You can use OCFS2 to host the voting disk files and the Oracle cluster registry (OCR), but not the grid infrastructure user's home, which must exist on a local file system on each node.

Because both CSS and O2CB use the lowest node number as a tie breaker in quorum calculations, ensure that the node numbers are the same in both clusters. If necessary, edit the O2CB configuration file, /etc/ocfs2/cluster.conf, to make the node numbering consistent. Then, update this file on all of the nodes. The change takes effect when the cluster is restarted.

3.7.3 Oracle Database Use Case

Specify the noatime option when mounting volumes that host Oracle datafiles, control files, redo logs, voting disk, and OCR. The noatime option disables unnecessary updates to the access time on the inodes.

Specify the nointr mount option to prevent signals interrupting I/O transactions that are in progress.

By default, the init.ora parameter filesystemio_options directs the database to perform direct I/O to the Oracle datafiles, control files, and redo logs. You should also specify the datavolume mount option for volumes that contain the voting disk and OCR. Do not specify this option for volumes that host the Oracle user's home directory or Oracle E-Business Suite.
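For example, a mount command for a volume that holds the voting disk and OCR might combine these options as follows; the device name and mount point are illustrative only:

sudo mount -o noatime,nointr,datavolume /dev/sdc1 /u02/crsdata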

To prevent database blocks from becoming fragmented across a disk, ensure that the file system cluster size is at least as large as the database block size, which is typically 8KB. If you specify the file system usage type as datafiles when using the mkfs.ocfs2 command, the file system cluster size is set to 128KB.
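For example, to format a volume with the datafiles usage type, you might run a command similar to the following; the label and device name are illustrative:

sudo mkfs.ocfs2 -T datafiles -L dbvol /dev/sdc1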


To enable multiple nodes to maximize throughput by concurrently streaming data to an Oracle datafile, OCFS2 deviates from the POSIX standard by not updating the modification time (mtime) on disk when performing non-extending direct I/O writes. The value of mtime is updated in memory. However, OCFS2 does not write the value to disk unless an application extends or truncates the file or performs an operation that changes the file metadata, such as using the touch command. This behavior can result in different nodes reporting different time stamps for the same file. Use the following command to view the on-disk timestamp of a file:

sudo debugfs.ocfs2 -R "stat /file_path" device | grep "mtime:"

Chapter 4 Managing Samba in Oracle Linux

Table of Contents

4.1 About Samba
    4.1.1 About Samba Services
    4.1.2 About the Samba Configuration File
4.2 Configuring and Using Samba
    4.2.1 Testing Samba Configuration by Using the testparm Command
    4.2.2 Configuring a Samba Stand-Alone Server
    4.2.3 Configuring a Samba Stand-Alone Server Within a Workgroup
    4.2.4 Adding a Samba Server as a Member of an AD Domain
4.3 Accessing Samba Shares
    4.3.1 Accessing Samba Shares From a Windows Client
    4.3.2 Accessing Samba Shares From an Oracle Linux Client

This chapter describes how to manage Samba in Oracle Linux 8, including tasks for configuring Samba and accessing Samba shares on different platforms.

For information about local file system management in Oracle Linux, see Oracle® Linux 8: Managing Local File Systems.

4.1 About Samba

Samba is an open-source implementation of the Server Message Block (SMB) protocol that enables Oracle Linux to interoperate with Microsoft Windows systems, as both a server and a client.

Samba implements the Distributed Computing Environment Remote Procedure Call (DCE RPC) protocol that is used by Microsoft Windows to provision file and print services for Windows clients. Samba also enables Oracle Linux users to access files on Windows systems and includes the ability to integrate with a Windows workgroup, an NT4 domain, or an Active Directory (AD) domain.

Samba uses the NetBIOS over TCP/IP protocol, which allows computer applications that depend on the NetBIOS API to work on TCP/IP networks.

4.1.1 About Samba Services

The Samba server consists of three important daemons: smbd, nmbd, and winbindd.

• The smbd daemon provides file sharing and printing services by using the SMB protocol. This daemon is also responsible for resource locking and for authenticating connecting users. The smb systemd service starts and stops the smbd daemon.

• The nmbd service provides host name and IP resolution by using the NetBIOS over IPv4 protocol. The nmbd service also enables browsing of the SMB network to locate domains, workgroups, hosts, file shares, and printers. The nmb systemd service starts and stops the nmbd daemon.

• The winbindd daemon is a Name Service Switch (NSS) daemon for resolving AD Users and Groups. The daemon enables AD Users to securely access services that are hosted on the Samba server. The winbind systemd service starts and stops the winbindd daemon.

To use the smbd and nmbd services, you need to install the samba package on your system. To use the winbindd service, install the samba-winbind package.


Note that if you are setting up Samba as a domain member, you must start the winbindd service before starting the smbd service. Otherwise, domain users and groups are not available to the local system.
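For example, on a domain member system you might start the services in the following order; a minimal sketch that assumes the packages listed previously are installed:

sudo systemctl start winbind
sudo systemctl start smb
sudo systemctl start nmb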

4.1.2 About the Samba Configuration File

Samba uses the /etc/samba/smb.conf file to manage Samba configuration. This file consists of several sections that you can configure to support the required services for a specific Samba configuration, for example:

[global]
security = ADS
realm = MYDOM.REALM
password server = krbsvr.mydom.com
load printers = yes
printing = cups
printcap name = cups

[printers]
comment = All Printers
path = /var/spool/samba
browseable = no
guest ok = yes
writable = no
printable = yes
printer admin = root, @ntadmins, @smbprintadm

[homes]
comment = User home directories
valid users = @smbusers
browsable = no
writable = yes
guest ok = no

[apps]
comment = Shared /usr/local/apps directory
path = /usr/local/apps
browsable = yes
writable = no
guest ok = yes

• [global]

Contains settings for the Samba server. In the previous example, the server is assumed to be a member of an AD domain that is running in native mode. Samba relies on Kerberos tickets issued by the AD server to authenticate clients that want to access local services.

• [printers]

Specifies support for print services. The path parameter specifies the location of a spooling directory that receives print jobs from Windows clients before submitting them to the local print spooler. Samba advertises all locally configured printers on the server.

• [homes]

Provides a personal share for each user in the smbusers group. The settings for browsable and writable prevent other users from browsing home directories, while allowing full access to valid users.

• [apps]

Specifies a share named apps, which grants Windows users browsing and read-only permission to the /usr/local/apps directory.


4.2 Configuring and Using Samba

The following tasks describe how to configure and use Samba in Oracle Linux 8.

4.2.1 Testing Samba Configuration by Using the testparm Command

After configuring the /etc/samba/smb.conf file, as described in Section 4.1.2, “About the Samba Configuration File”, you can verify the configuration by using the testparm command. The testparm command detects invalid parameters and values, as well as incorrect settings such as a misconfigured ID mapping. If the testparm command does not report any problems, the Samba services successfully load the configuration that is specified in the /etc/samba/smb.conf file. Note that the testparm command cannot test whether the configured services will be available or work as expected. You should run the testparm command every time you change the Samba configuration.
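For example, running the command without any arguments checks the default /etc/samba/smb.conf file; you can optionally pass the -s option to skip the interactive prompt and dump the service definitions immediately:

testparm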

The following example shows the type of output that might be displayed when you run the testparm command:

Load smb config files from /etc/samba/smb.conf
Loaded services file OK.
WARNING: The 'netbios name' is too long (max. 15 chars).

Server role: ROLE_STANDALONE

Press enter to see a dump of your service definitions ...

If the testparm command reports any errors or misconfiguration in the /etc/samba/smb.conf file, fix the problem, then re-run the command.

For more information, see the testparm(1) manual page.

4.2.2 Configuring a Samba Stand-Alone Server

To configure a Samba stand-alone server:

1. Install the samba and samba-winbind packages:

sudo dnf install samba samba-winbind

2. Edit the /etc/samba/smb.conf file and configure the various sections to support the services that are required for your specific configuration.

For general information, see Section 4.1.2, “About the Samba Configuration File”.

For specific instructions on configuring a Samba stand-alone server within a workgroup, see Section 4.2.3, “Configuring a Samba Stand-Alone Server Within a Workgroup”.

For specific instructions on adding a Samba server as a member of an AD domain, see Section 4.2.4, “Adding a Samba Server as a Member of an AD Domain”.

3. (Optional) Configure file system sharing, as needed.

See Section 4.3.1, “Accessing Samba Shares From a Windows Client” and Section 4.3.2, “Accessing Samba Shares From an Oracle Linux Client” for instructions.

4. Test the configuration by running the testparm command.


If the command returns any errors or reports a misconfiguration, manually fix the errors in the /etc/samba/smb.conf file and then re-run the command. See Section 4.2.1, “Testing Samba Configuration by Using the testparm Command” for more information.

5. Configure the system firewall to enable incoming TCP connections to ports 139 and 445 and incoming UDP datagrams to ports 137 and 138:

sudo firewall-cmd --zone=zone --add-port=139/tcp --add-port=445/tcp --add-port=137-138/udp

sudo firewall-cmd --permanent --zone=zone --add-port=139/tcp --add-port=445/tcp --add-port=137-138/udp

The nmbd daemon services NetBIOS Name Service requests on UDP port 137 and NetBIOS Datagram Service requests on UDP port 138.

The default ports are 139 (used for SMB over NetBIOS over TCP) and 445 (used for plain SMB over TCP). An alternative that uses the predefined firewalld samba service is shown in the sketch after this procedure.

6. (Optional) Add similar rules for other networks from which Samba clients can connect, as required.

7. Start and enable the smb service so the service starts following a system reboot:

sudo systemctl start smb

sudo systemctl enable smb

Note

If you make changes to the /etc/samba/smb.conf file and any files that this file references, the smb service reloads the configuration automatically, after a delay of up to one minute. If necessary, you can force the smb service to reload the new configuration by sending a SIGHUP signal to the service daemon:

sudo killall -SIGHUP smbd

Reloading the configuration has no effect on connections that are already established. To apply configuration changes to existing connections, you must either restart the smb service or require the connected users to disconnect and then reconnect.
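As an alternative to opening the individual ports in Step 5, you can enable the predefined samba service that ships with firewalld; a minimal sketch, with an illustrative zone name:

sudo firewall-cmd --permanent --zone=public --add-service=samba

sudo firewall-cmd --reload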

For more information, see the smb.conf(5) and smbd(8) manual pages. See also http://www.samba.org/samba/docs/.

4.2.3 Configuring a Samba Stand-Alone Server Within a Workgroup

Windows systems that are on an enterprise network usually belong to either a workgroup or a domain.

Workgroups are usually only configured on networks that connect a small number of computers. A workgroup environment is a peer-to-peer network, where the systems do not rely on each other for services and there is no centralized management. User accounts, access control, and system resources are configured independently on each system. Such systems can share resources only if they are configured to do so.

A Samba server can act as a stand-alone server within a workgroup. To configure a stand-alone Samba server within a workgroup, follow the instructions in Section 4.2.2, “Configuring a Samba Stand-Alone Server”. For Step 2 of the procedure, configure the settings in the /etc/samba/smb.conf file as follows:

Configure the [global] section by using share-level security, as follows:

[global]
security = share
workgroup = workgroup_name
netbios name = netbios_name

The client provides only a password to the server, and not a user name. Typically, each share is associated with a valid users parameter. The server validates the password against the hashed passwords that are stored in the /etc/passwd and /etc/shadow files, NIS, or LDAP for the listed users. Note that user-level security is preferred over share-level security, as shown in the following example:

[global]
security = user
workgroup = workgroup_name
netbios name = netbios_name

In the user security model, a client must supply a valid user name and a password. This model supports encrypted passwords. If the server successfully validates the client's user name and password, the client can mount multiple shares without being required to specify a password.

Use the smbpasswd command to create an entry for a user in the Samba password file, for example:

sudo smbpasswd -a guest

New SMB password: password
Retype new SMB password: password
Added user guest.

Note

The user must already exist on the system. If permitted to log in to the server, the user can use the smbpasswd command to change his or her password.

If a Windows user has a user name that is different from the user name on the Samba server, create a mapping between these names in the /etc/samba/smbusers file, for example:

root = admin administrator root
nobody = guest nobody pcguest smbguest
eddie = ejones
fiona = fchau

In the previous example, the first entry on each line is the user name on the Samba server. The entries that appear after the equal sign (=) are the equivalent Windows user names.

Note

Only the user security model uses Samba passwords.

The server security model, where the Samba server relies on another server to authenticate user names and passwords, is deprecated. This model has numerous security and interoperability issues.

4.2.4 Adding a Samba Server as a Member of an AD Domain

Typically, corporate networks configure domains to enable large numbers of networked systems to be administered centrally. A domain is a group of trusted computers that share security and access control. Systems that are known as domain controllers provide centralized management and security. Windows domains are usually configured to use AD, which is based on the Lightweight Directory Access Protocol (LDAP) and uses Kerberos and DNS to provide authentication, access control to domain resources, and name services. Some Windows domains use Windows NT4 security, which does not use Kerberos to perform authentication.


Note

A Samba server can be a member of an AD or NT4 security domain, but it cannot operate as a domain controller. As a domain member, a Samba server must authenticate itself with a domain controller; thus, it is controlled by the security rules of the domain. The domain controller authenticates clients, while the Samba server controls access to printers and network shares.

In the Active Directory Server (ADS) security model, Samba acts as a domain member server in an ADS realm. Clients use Kerberos tickets for AD authentication. You must first configure Kerberos and then join the server to the domain, which creates a machine account for your server on the domain controller.

To add a Samba server to an AD domain:

1. Edit /etc/samba/smb.conf and configure the [global] section to use ADS, for example:

[global]
security = ADS
realm = KERBEROS.REALM

You might also have to specify the password server explicitly if different servers support AD services and Kerberos authentication:

password server = kerberos_server.your_domain

2. Install the krb5-workstation package:

sudo dnf install krb5-workstation

3. Create a Kerberos ticket for the Administrator account in the Kerberos domain, for example:

sudo kinit Administrator@KERBEROS.REALM

This command creates the Kerberos ticket that is required to join the server to the AD domain.

4. Join the server to the AD domain:

sudo net ads join -S winads.mydom.com -U Administrator%password

In the previous example, the AD server is winads.mydom.com and password is the password for the Administrator account.

The command creates a machine account in Active Directory for the Samba server and enables it to join the domain.

5. Restart the smb service:

sudo systemctl restart smb
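After the services restart, you can optionally confirm that the join succeeded and that domain accounts are visible; a minimal check, where wbinfo is provided by the samba-winbind-clients package:

sudo net ads testjoin

wbinfo -u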

4.3 Accessing Samba Shares

The following tasks describe how to access Samba shares from a Windows client and an Oracle Linux client.

4.3.1 Accessing Samba Shares From a Windows Client

To access a share on a Samba server from Windows, open Computer or Windows Explorer, and enter the host name of the Samba server and the share name using the following format:


\\server_name\share_name

If you enter \\server_name, Windows displays the directories and printers that the server is sharing. You can also use the same syntax to map a network drive to a share name.

4.3.2 Accessing Samba Shares From an Oracle Linux Client

Note

To use the following commands, the samba-client and cifs-utils packages must be installed on the system.

You can use the findsmb command to query a subnet for Samba servers. The command displays the IP address, NetBIOS name, workgroup, operating system and version for each server that it finds.

Alternatively, you can use the smbtree command, which is a text-based SMB network browser that displays the hierarchy of known domains, the servers in those domains, and the shares on those servers.
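For example, the following commands survey the local SMB network; the -N option tells smbtree not to prompt for a password, so only shares that permit anonymous browsing are listed:

findsmb

smbtree -N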

The GNOME desktop provides browser-based file managers that you can use to view Windows shares on the network. Enter smb: in the location bar of a file manager to browse network shares.

To connect to a Windows share by using the command line, use the smbclient command:

sudo smbclient //server_name/share_name [-U username]

After logging in, enter help at the smb:\> prompt to display a list of available commands.

To mount a Samba share, use a command similar to the following:

sudo mount -t cifs //server_name/share_name mountpoint -o credentials=credfile

In the previous command, the credentials file contains settings for username, password, and domain:

username=eddie
password=clydenw
domain=MYDOMWKG

The argument to domain can be the name of a domain or a workgroup.

Caution

Because the credentials file contains a plain-text password, use chmod to make it readable by only you, for example:

sudo chmod 400 credfile

If the Samba server is a domain member server in an AD domain, and your current login session was authenticated by the Kerberos server in the domain, you can use your existing session credentials by specifying the sec=krb5 option instead of a credentials file:

sudo mount -t cifs //server_name/share_name mountpoint -o sec=krb5

For more information, see the findsmb(1), mount.cifs(8), smbclient(1), and smbtree(1) manual pages.
