SUSE Linux Enterprise High Availability
Total Page:16
File Type:pdf, Size:1020Kb
SUSE Linux Enterprise High Availability Extension 11 SP4 High Availability Guide High Availability Guide SUSE Linux Enterprise High Availability Extension 11 SP4 by Tanja Roth and Thomas Schraitle Publication Date: September 24, 2021 SUSE LLC 1800 South Novell Place Provo, UT 84606 USA https://documentation.suse.com Copyright © 2006– 2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”. For SUSE or Novell trademarks, see the Novell Trademark and Service Mark list http://www.novell.com/ company/legal/trademarks/tmlist.html . All other third party trademarks are the property of their respective owners. A trademark symbol (®, ™ etc.) denotes a SUSE or Novell trademark; an asterisk (*) denotes a third party trademark. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its aliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof. Contents About This Guide xiii 1 Feedback xiv 2 Documentation Conventions xiv 3 About the Making of This Manual xv I INSTALLATION AND SETUP 1 1 Product Overview 2 1.1 Availability as Add-On/Extension 2 1.2 Key Features 3 Wide Range of Clustering Scenarios 3 • Flexibility 3 • Storage and Data Replication 4 • Support for Virtualized Environments 4 • Support of Local, Metro, and Geo Clusters 4 • Resource Agents 5 • User-friendly Administration Tools 5 1.3 Benefits 6 1.4 Cluster Configurations: Storage 10 1.5 Architecture 12 Architecture Layers 12 • Process Flow 15 2 System Requirements and Recommendations 16 2.1 Hardware Requirements 16 2.2 Software Requirements 17 2.3 Storage Requirements 17 2.4 Other Requirements and Recommendations 17 iii High Availability Guide 3 Installation and Basic Setup 21 3.1 Definition of Terms 21 3.2 Overview 23 3.3 Installation as Add-on 25 3.4 Automatic Cluster Setup (sleha-bootstrap) 26 3.5 Manual Cluster Setup (YaST) 30 YaST Cluster Module 30 • Defining the Communication Channels 31 • Defining Authentication Settings 35 • Transferring the Configuration to All Nodes 36 • Synchronizing Connection Status Between Cluster Nodes 40 • Configuring Services 41 • Bringing the Cluster Online 43 3.6 Mass Deployment with AutoYaST 44 II CONFIGURATION AND ADMINISTRATION 46 4 Configuration and Administration Basics 47 4.1 Global Cluster Options 47 Overview 47 • Option no-quorum-policy 48 • Option stonith- enabled 49 4.2 Cluster Resources 49 Resource Management 49 • Supported Resource Agent Classes 50 • Types of Resources 51 • Resource Templates 52 • Advanced Resource Types 53 • Resource Options (Meta Attributes) 56 • Instance Attributes (Parameters) 59 • Resource Operations 62 • Timeout Values 63 4.3 Resource Monitoring 64 4.4 Resource Constraints 66 Types of Constraints 66 • Scores and Infinity 69 • Resource Templates and Constraints 70 • Failover Nodes 70 • Failback Nodes 72 • Placing Resources Based on Their Load Impact 72 • Grouping Resources by Using Tags 76 iv High Availability Guide 4.5 Managing Services on Remote Hosts 76 Monitoring Services on Remote Hosts with Nagios Plug-ins 76 • Managing Services on Remote Nodes with pacemaker_remote 78 4.6 Monitoring System Health 79 4.7 Maintenance Mode 80 4.8 For More Information 82 5 Configuring and Managing Cluster Resources (Web Interface) 83 5.1 Hawk—Overview 83 Starting Hawk and Logging In 83 • Main Screen: Cluster Status 85 5.2 Configuring Global Cluster Options 89 5.3 Configuring Cluster Resources 91 Configuring Resources with the Setup Wizard 92 • Creating Simple Cluster Resources 93 • Creating STONITH Resources 95 • Using Resource Templates 96 • Configuring Resource Constraints 98 • Specifying Resource Failover Nodes 102 • Specifying Resource Failback Nodes (Resource Stickiness) 103 • Configuring Placement of Resources Based on Load Impact 104 • Configuring Resource Monitoring 105 • Configuring a Cluster Resource Group 107 • Configuring a Clone Resource 108 5.4 Managing Cluster Resources 109 Starting Resources 110 • Cleaning Up Resources 111 • Removing Cluster Resources 112 • Migrating Cluster Resources 112 • Using Maintenance Mode 114 • Viewing the Cluster History 116 • Exploring Potential Failure Scenarios 118 • Generating a Cluster Report 121 5.5 Monitoring Multiple Clusters 122 5.6 Hawk for Geo Clusters 124 5.7 Troubleshooting 124 v High Availability Guide 6 Configuring and Managing Cluster Resources (GUI) 126 6.1 Pacemaker GUI—Overview 126 Logging in to a Cluster 126 • Main Window 127 6.2 Configuring Global Cluster Options 129 6.3 Configuring Cluster Resources 130 Creating Simple Cluster Resources 130 • Creating STONITH Resources 133 • Using Resource Templates 134 • Configuring Resource Constraints 135 • Specifying Resource Failover Nodes 139 • Specifying Resource Failback Nodes (Resource Stickiness) 140 • Configuring Placement of Resources Based on Load Impact 141 • Configuring Resource Monitoring 144 • Configuring a Cluster Resource Group 145 • Configuring a Clone Resource 149 6.4 Managing Cluster Resources 149 Starting Resources 150 • Cleaning Up Resources 151 • Removing Cluster Resources 152 • Migrating Cluster Resources 152 • Changing Management Mode of Resources 154 7 Configuring and Managing Cluster Resources (Command Line) 155 7.1 crmsh—Overview 155 Getting Help 156 • Executing crmsh's Subcommands 157 • Displaying Information about OCF Resource Agents 159 • Using Configuration Templates 160 • Testing with Shadow Configuration 162 • Debugging Your Configuration Changes 163 • Cluster Diagram 163 7.2 Configuring Global Cluster Options 164 7.3 Configuring Cluster Resources 165 Creating Cluster Resources 165 • Creating Resource Templates 166 • Creating a STONITH Resource 167 • Configuring Resource Constraints 168 • Specifying Resource Failover Nodes 171 • Specifying Resource Failback Nodes (Resource Stickiness) 171 • Configuring Placement of Resources Based on Load vi High Availability Guide Impact 171 • Configuring Resource Monitoring 174 • Configuring a Cluster Resource Group 175 • Configuring a Clone Resource 175 7.4 Managing Cluster Resources 176 Starting a New Cluster Resource 177 • Cleaning Up Resources 177 • Removing a Cluster Resource 178 • Migrating a Cluster Resource 178 • Grouping/Tagging Resources 179 • Using Maintenance Mode 179 7.5 Setting Passwords Independent of cib.xml 180 7.6 Retrieving History Information 181 7.7 For More Information 182 8 Adding or Modifying Resource Agents 183 8.1 STONITH Agents 183 8.2 Writing OCF Resource Agents 183 8.3 OCF Return Codes and Failure Recovery 185 9 Fencing and STONITH 187 9.1 Classes of Fencing 187 9.2 Node Level Fencing 188 STONITH Devices 188 • STONITH Implementation 189 9.3 STONITH Resources and Configuration 190 Example STONITH Resource Configurations 190 9.4 Monitoring Fencing Devices 193 9.5 Special Fencing Devices 193 9.6 Basic Recommendations 195 9.7 For More Information 196 10 Access Control Lists 197 10.1 Requirements and Prerequisites 197 vii High Availability Guide 10.2 Enabling Use of ACLs In Your Cluster 198 10.3 The Basics of ACLs 199 Setting ACL Rules via XPath Expressions 199 • Setting ACL Rules via Abbreviations 201 10.4 Configuring ACLs with the Pacemaker GUI 202 10.5 Configuring ACLs with Hawk 203 10.6 Configuring ACLs with crmsh 204 11 Network Device Bonding 206 11.1 Configuring Bonding Devices with YaST 206 11.2 Hotplugging of Bonding Slaves 209 11.3 For More Information 211 12 Load Balancing with Linux Virtual Server 212 12.1 Conceptual Overview 212 Director 212 • User Space Controller and Daemons 213 • Packet Forwarding 213 • Scheduling Algorithms 214 12.2 Configuring IP Load Balancing with YaST 214 12.3 Further Setup 220 12.4 For More Information 220 13 Geo Clusters (Multi-Site Clusters) 221 III STORAGE AND DATA REPLICATION 222 14 OCFS2 223 14.1 Features and Benefits 223 14.2 OCFS2 Packages and Management Utilities 224 14.3 Configuring OCFS2 Services and a STONITH Resource 225 14.4 Creating OCFS2 Volumes 227 viii High Availability Guide 14.5 Mounting OCFS2 Volumes 229 14.6 Using Quotas on OCFS2 File Systems 231 14.7 For More Information 231 15 DRBD 232 15.1 Conceptual Overview 232 15.2 Installing DRBD Services 234 15.3 Configuring the DRBD Service 235 15.4 Testing the DRBD Service 241 15.5 Tuning DRBD 243 15.6 Troubleshooting DRBD 243 Configuration 244 • Host Names 244 • TCP Port 7788 244 • DRBD Devices Broken after Reboot 245 15.7 For More Information 245 16 Cluster Logical Volume Manager (cLVM) 246 16.1 Conceptual Overview 246 16.2 Configuration of cLVM 247 Configuring Cmirrord 247 • Creating the Cluster Resources 250 • Scenario: cLVM With iSCSI on SANs 251 • Scenario: cLVM With DRBD 255 16.3 Configuring Eligible LVM2 Devices Explicitly 257 16.4 For More Information 258 17 Storage Protection 259 17.1 Storage-based Fencing 259 Overview 260 • Number of SBD Devices 260 • Setting Up Storage-based Protection 261 17.2 Ensuring Exclusive Storage Activation 268 Overview 268 • Setup 268 ix High Availability Guide 17.3 For More Information 270 18 Samba Clustering 271 18.1 Conceptual Overview 271 18.2 Basic Configuration 272 18.3 Joining an Active Directory Domain 275 18.4 Debugging and Testing Clustered Samba 276 18.5 For More Information 278 19 Disaster Recovery