IBM AIX Continuous Availability Features
Total Page:16
File Type:pdf, Size:1020Kb
Front cover IBM AIX Continuous Availability Features Learn about AIX V6.1 and POWER6 advanced availability features View sample programs that exploit storage protection keys Harden your AIX system Octavian Lascu Shawn Bodily Matti Harvala Anil K Singh DoYoung Song Frans Van Den Berg ibm.com/redbooks Redpaper International Technical Support Organization IBM AIX Continuous Availability Features April 2008 REDP-4367-00 Note: Before using this information and the product it supports, read the information in “Notices” on page vii. First Edition (April 2008) This edition applies to Version 6, Release 1, of AIX Operating System (product number 5765-G62). © Copyright International Business Machines Corporation 2008. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Notices . vii Trademarks . viii Preface . ix The team that wrote this paper . ix Become a published author . xi Comments welcome. xii Chapter 1. Introduction. 1 1.1 Overview . 2 1.2 Business continuity . 2 1.3 Disaster recovery . 3 1.4 High availability . 3 1.5 Continuous operations . 4 1.6 Continuous availability . 4 1.6.1 Reliability. 5 1.6.2 Availability . 6 1.6.3 Serviceability. 6 1.7 First Failure Data Capture. 7 1.8 IBM AIX continuous availability strategies . 8 Chapter 2. AIX continuous availability features . 11 2.1 System availability. 12 2.1.1 Dynamic Logical Partitioning. 12 2.1.2 CPU Guard . 12 2.1.3 CPU Sparing . 12 2.1.4 Predictive CPU deallocation and dynamic processor deallocation . 13 2.1.5 Processor recovery and alternate processor . 13 2.1.6 Excessive interrupt disablement detection . 13 2.1.7 Memory page deallocation . 14 2.1.8 System Resource Controller . 14 2.1.9 PCI hot plug management . 15 2.1.10 Reliable Scalable Cluster Technology . 15 2.1.11 Dual IBM Virtual I/O Server . 17 2.1.12 Special Uncorrectable Error handling . 18 2.1.13 Automated system hang recovery. 19 2.1.14 Recovery framework . 19 2.2 System reliability . 19 2.2.1 Error checking. 20 2.2.2 Extended Error Handling. 20 2.2.3 Paging space verification . 21 2.2.4 Storage keys . 21 2.3 System serviceability. 23 2.3.1 Advanced First Failure Data Capture features . 23 2.3.2 Traditional system dump. 23 2.3.3 Firmware-assisted system dump . 24 2.3.4 Live dump and component dump . 25 2.3.5 The dumpctrl command . 26 2.3.6 Parallel dump . 27 © Copyright IBM Corp. 2008. All rights reserved. iii 2.3.7 Minidump . 27 2.3.8 Trace (system trace) . 27 2.3.9 Component Trace facility . 29 2.3.10 Lightweight Memory Trace (LMT) . 30 2.3.11 ProbeVue . 30 2.3.12 Error logging . 30 2.3.13 The alog facility . 31 2.3.14 syslog . 32 2.3.15 Concurrent AIX Update. 34 2.3.16 Core file control. 35 2.4 Network tools . 36 2.4.1 Virtual IP address support (VIPA) . 37 2.4.2 Multipath IP routing . 37 2.4.3 Dead gateway detection . 37 2.4.4 EtherChannel . 38 2.4.5 IEEE 802.3ad Link Aggregation . 39 2.4.6 2-Port Adapter-based Ethernet failover. 40 2.4.7 Shared Ethernet failover . 40 2.5 Storage tools . 40 2.5.1 Hot swap disks . 40 2.5.2 System backup (mksysb) . 40 2.5.3 Alternate disk installation . ..