Administration Guide Administration Guide SUSE Linux Enterprise High Availability Extension 15 SP1 by Tanja Roth and Thomas Schraitle
Total Page:16
File Type:pdf, Size:1020Kb
SUSE Linux Enterprise High Availability Extension 15 SP1 Administration Guide Administration Guide SUSE Linux Enterprise High Availability Extension 15 SP1 by Tanja Roth and Thomas Schraitle This guide is intended for administrators who need to set up, congure, and maintain clusters with SUSE® Linux Enterprise High Availability Extension. For quick and ecient conguration and administration, the product includes both a graphical user interface and a command line interface (CLI). For performing key tasks, both approaches are covered in this guide. Thus, you can choose the appropriate tool that matches your needs. Publication Date: September 24, 2021 SUSE LLC 1800 South Novell Place Provo, UT 84606 USA https://documentation.suse.com Copyright © 2006–2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”. For SUSE trademarks, see http://www.suse.com/company/legal/ . All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its aliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its aliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof. Contents About This Guide xv 1 Available Documentation xvi 2 Giving Feedback xvii 3 Documentation Conventions xviii 4 Product Life Cycle and Support xix Support Statement for SUSE Linux Enterprise High Availability Extension xx • Technology Previews xxi I INSTALLATION, SETUP AND UPGRADE 1 1 Product Overview 2 1.1 Availability as Extension 2 1.2 Key Features 3 Wide Range of Clustering Scenarios 3 • Flexibility 3 • Storage and Data Replication 4 • Support for Virtualized Environments 4 • Support of Local, Metro, and Geo Clusters 4 • Resource Agents 5 • User-friendly Administration Tools 5 1.3 Benefits 6 1.4 Cluster Configurations: Storage 10 1.5 Architecture 12 Architecture Layers 12 • Process Flow 14 2 System Requirements and Recommendations 15 2.1 Hardware Requirements 15 2.2 Software Requirements 16 2.3 Storage Requirements 17 iii Administration Guide 2.4 Other Requirements and Recommendations 18 3 Installing the High Availability Extension 20 3.1 Manual Installation 20 3.2 Mass Installation and Deployment with AutoYaST 20 4 Using the YaST Cluster Module 23 4.1 Definition of Terms 23 4.2 YaST Cluster Module 25 4.3 Defining the Communication Channels 27 4.4 Defining Authentication Settings 32 4.5 Transferring the Configuration to All Nodes 33 Configuring Csync2 with YaST 34 • Synchronizing Changes with Csync2 35 4.6 Synchronizing Connection Status Between Cluster Nodes 37 4.7 Configuring Services 38 4.8 Bringing the Cluster Online Node by Node 40 5 Upgrading Your Cluster and Updating Software Packages 41 5.1 Terminology 41 5.2 Upgrading your Cluster to the Latest Product Version 42 Supported Upgrade Paths for SLE HA and SLE HA Geo 43 • Required Preparations Before Upgrading 46 • Cluster Offline Upgrade 47 • Cluster Rolling Upgrade 50 5.3 Updating Software Packages on Cluster Nodes 53 5.4 For More Information 54 iv Administration Guide II CONFIGURATION AND ADMINISTRATION 55 6 Configuration and Administration Basics 56 6.1 Use Case Scenarios 56 6.2 Quorum Determination 57 Global Cluster Options 58 • Global Option no-quorum- policy 58 • Global Option stonith-enabled 59 • Corosync Configuration for Two-Node Clusters 59 • Corosync Configuration for N- Node Clusters 60 6.3 Cluster Resources 61 Resource Management 61 • Supported Resource Agent Classes 62 • Types of Resources 64 • Resource Templates 64 • Advanced Resource Types 65 • Resource Options (Meta Attributes) 68 • Instance Attributes (Parameters) 71 • Resource Operations 73 • Timeout Values 75 6.4 Resource Monitoring 76 6.5 Resource Constraints 78 Types of Constraints 78 • Scores and Infinity 81 • Resource Templates and Constraints 82 • Failover Nodes 83 • Failback Nodes 84 • Placing Resources Based on Their Load Impact 85 • Grouping Resources by Using Tags 88 6.6 Managing Services on Remote Hosts 88 Monitoring Services on Remote Hosts with Monitoring Plug- ins 89 • Managing Services on Remote Nodes with pacemaker_remote 90 6.7 Monitoring System Health 91 6.8 For More Information 93 7 Configuring and Managing Cluster Resources with Hawk2 94 7.1 Hawk2 Requirements 94 7.2 Logging In 95 v Administration Guide 7.3 Hawk2 Overview: Main Elements 96 Left Navigation Bar 97 • Top-Level Row 98 7.4 Configuring Global Cluster Options 99 7.5 Configuring Cluster Resources 101 Showing the Current Cluster Configuration (CIB) 102 • Adding Resources with the Wizard 102 • Adding Simple Resources 103 • Adding Resource Templates 105 • Modifying Resources 105 • Adding STONITH Resources 107 • Adding Cluster Resource Groups 108 • Adding Clone Resources 110 • Adding Multi-state Resources 111 • Grouping Resources by Using Tags 112 • Configuring Resource Monitoring 113 7.6 Configuring Constraints 116 Adding Location Constraints 116 • Adding Colocation Constraints 117 • Adding Order Constraints 119 • Using Resource Sets for Constraints 121 • For More Information 122 • Specifying Resource Failover Nodes 123 • Specifying Resource Failback Nodes (Resource Stickiness) 124 • Configuring Placement of Resources Based on Load Impact 125 7.7 Managing Cluster Resources 128 Editing Resources and Groups 128 • Starting Resources 129 • Cleaning Up Resources 130 • Removing Cluster Resources 130 • Migrating Cluster Resources 131 7.8 Monitoring Clusters 132 Monitoring a Single Cluster 132 • Monitoring Multiple Clusters 133 7.9 Using the Batch Mode 137 7.10 Viewing the Cluster History 141 Viewing Recent Events of Nodes or Resources 141 • Using the History Explorer for Cluster Reports 142 • Viewing Transition Details in the History Explorer 144 7.11 Verifying Cluster Health 146 vi Administration Guide 8 Configuring and Managing Cluster Resources (Command Line) 147 8.1 crmsh—Overview 147 Getting Help 148 • Executing crmsh's Subcommands 149 • Displaying Information about OCF Resource Agents 151 • Using crmsh's Shell Scripts 152 • Using crmsh's Cluster Scripts 153 • Using Configuration Templates 156 • Testing with Shadow Configuration 158 • Debugging Your Configuration Changes 159 • Cluster Diagram 159 8.2 Managing Corosync Configuration 159 8.3 Configuring Cluster Resources 160 Loading Cluster Resources from a File 161 • Creating Cluster Resources 161 • Creating Resource Templates 162 • Creating a STONITH Resource 163 • Configuring Resource Constraints 164 • Specifying Resource Failover Nodes 167 • Specifying Resource Failback Nodes (Resource Stickiness) 167 • Configuring Placement of Resources Based on Load Impact 168 • Configuring Resource Monitoring 170 • Configuring a Cluster Resource Group 171 • Configuring a Clone Resource 171 8.4 Managing Cluster Resources 173 Showing Cluster Resources 173 • Starting a New Cluster Resource 174 • Stopping a Cluster Resource 175 • Cleaning Up Resources 175 • Removing a Cluster Resource 176 • Migrating a Cluster Resource 176 • Grouping/Tagging Resources 177 • Getting Health Status 177 8.5 Setting Passwords Independent of cib.xml 178 8.6 Retrieving History Information 178 8.7 For More Information 180 9 Adding or Modifying Resource Agents 181 9.1 STONITH Agents 181 9.2 Writing OCF Resource Agents 181 9.3 OCF Return Codes and Failure Recovery 183 vii Administration Guide 10 Fencing and STONITH 185 10.1 Classes of Fencing 185 10.2 Node Level Fencing 186 STONITH Devices 186 • STONITH Implementation 187 10.3 STONITH Resources and Configuration 188 Example STONITH Resource Configurations 188 10.4 Monitoring Fencing Devices 192 10.5 Special Fencing Devices 192 10.6 Basic Recommendations 194 10.7 For More Information 195 11 Storage Protection and SBD 196 11.1 Conceptual Overview 196 11.2 Overview of Manually Setting Up SBD 198 11.3 Requirements 198 11.4 Number of SBD Devices 199 11.5 Calculation of Timeouts 200 11.6 Setting Up the Watchdog 201 Using a Hardware Watchdog 201 • Using the Software Watchdog (softdog) 203 11.7 Setting Up SBD with Devices 204 11.8 Setting Up Diskless SBD 209 11.9 Testing SBD and Fencing 211 11.10 Additional Mechanisms for Storage Protection 212 Configuring an sg_persist Resource 212 • Ensuring Exclusive Storage Activation with sfex 214 11.11 For More Information 216 viii Administration Guide 12 QDevice and QNetd 217 12.1 Conceptual Overview 217 12.2 Requirements and Prerequisites 219 12.3 Setting Up the QNetd Server 219 12.4 Connecting QDevice Clients to the QNetd Server 220 12.5 Setting Up a QDevice with Heuristics 220 12.6 Checking and Showing Quorum Status 221 12.7 For More Information 223 13 Access Control Lists 224 13.1 Requirements and Prerequisites 224 13.2 Conceptual Overview 225 13.3 Enabling Use of ACLs in Your Cluster 225 13.4 Creating a Read-Only Monitor Role 226 Creating a Read-Only Monitor Role with Hawk2 226 • Creating a Read-Only Monitor Role with crmsh 228 13.5 Removing a User 229 Removing a User with Hawk2 229 • Removing a User with crmsh 230 13.6 Removing an Existing Role 230 Removing an Existing Role with Hawk2 230 • Removing an Existing Role with crmsh 231 13.7 Setting ACL Rules via XPath Expressions