The Compaq Health & Wellness Driver

The Compaq ProLiant Team

May 31, 2001

This guide was designed to facilitate the installation and use of the Compaq Health and Wellness Driver on various Linux distributions on Compaq ProLiant Servers.

Notice

© 2001 Compaq Computer Corporation

Compaq, Compaq Insight Manager, NetFlex, NonStop, ProLiant, ROMPaq, and SmartStart are registered in the United States Patent and Trademark Office.

Alpha, AlphaServer, AlphaStation, ProSignia, and SoftPaq are trademarks and/or service marks of Compaq Computer Corporation.

Netelligent is a trademark and/or service mark of Compaq Information Technologies Group, L.P. in the U.S. and/or other countries.

Microsoft, MS-DOS, Windows, and Windows NT are trademarks and/or registered trademarks of Microsoft Corporation.

UNIX is a registered trademark of The Open Group.

SCO, UnixWare, OpenServer 5, and UnixWare 7 are registered trademarks of the Santa Cruz Operation.

Linux is a registered trademark of Linus Torvalds.

Red Hat is a registered trademark of Red Hat, Inc.

Caldera Systems and OpenLinux are either registered trademarks or trademarks of Caldera Systems.

TurboLinux is a trademark of Turbo Linux, Inc.

SuSE is a registered trademark of SuSE AG.

Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

The information in this publication is subject to change without notice and is provided "AS IS" WITHOUT WARRANTY OF ANY KIND. THE ENTIRE RISK ARISING OUT OF THE USE OF THIS INFORMATION REMAINS WITH RECIPIENT. IN NO EVENT SHALL COMPAQ BE LIABLE FOR ANY DIRECT, CONSEQUENTIAL, INCIDENTAL, SPECIAL, PUNITIVE OR OTHER DAMAGES WHATSOEVER (INCLUDING WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION OR LOSS OF BUSINESS INFORMATION), EVEN IF COMPAQ HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

The limited warranties for Compaq products are exclusively set forth in the documentation accompanying such products. Nothing herein should be construed as constituting a further or additional warranty.

This publication does not constitute an endorsement of the product or products that were tested. The configuration or configurations tested or described may or may not be the only available solution. This test is not a determination of product quality or correctness, nor does it ensure compliance with any federal, state, or local requirements.

Compaq Health & Wellness Driver How-To

Solution Guide prepared by Compaq ProLiant Linux Team

Second Edition (November 2000)

Third Edition (December 2000)

Fourth Edition (February 2001)

Fifth Edition (April 2001)

Sixth Edition (May 2001)

Abstract

Compaq has created many different tools for managing Compaq servers, a key component of which is the health and wellness driver. This document describes the features of the health and wellness driver for Linux, how it can be installed, and how the information it provides can be leveraged.

Contents

1 WHAT IS THE HEALTH AND WELLNESS DRIVER?
    1.1 Exposing the Health log into /proc
    1.2 System Temperature Monitoring
    1.3 System Fan Monitoring
    1.4 Monitoring the System Fault Tolerant Power Supply
    1.5 ECC Memory Monitoring
    1.6 Automatic Server Recovery (ASR)
2 SETUP PROCEDURE
    2.1 Install
    2.2 Upgrading the Driver
    2.3 Running the Driver
    2.4 Uninstall
    2.5 Behind the Scenes
3 CONSOLE MESSAGES
    3.1 Memory
    3.2 Thermal Sensors (Temperature)
    3.3 Fans
    3.4 Power Supplies
    3.5 Processor Power Modules
4 INFORMATION RETRIEVAL
    4.1 Temperature
    4.2 Fan
    4.3 Power Supply
    4.4 Integrated Management Log (IML)
5 COMPAQ INTEGRATED MANAGEMENT LOG VIEWER (IML VIEWER)
    5.1 Running the IML Viewer
    5.2 File Menu
    5.3 Log Menu
    5.4 View Menu
6 TROUBLESHOOTING
    6.1 Non Certified Machines
    6.2 Health Driver Immediately Stops after Installation
    6.3 No Console Messages
    6.4 Failed Dependencies
    6.5 Failure in cpqimlview
    6.6 Superuser Only

1 What is the Health and Wellness Driver?

The Compaq Wellness Driver (cpqhealth.o) collects and monitors important operational data on your server to ensure that the system is "healthy". Any abnormal conditions are logged into a non-volatile Health Log and can be inspected by using certain /proc entries.

Compaq Servers are equipped with hardware sensors and firmware to monitor certain abnormal conditions such as abnormal temperature readings, fan failures, ECC memory errors, etc. The cpqhealth.o driver monitors these conditions and reports them to the administrator by printing messages on the console (preserved in /var/log/messages), and also logging the condition into the server's health log.

The following is a list of the features of the Compaq Wellness Driver:

1.1 Exposing the Health log into /proc

Most events trigger a log entry into a Compaq internal area of non volatile memory (NVRAM). This health log is exposed through the Compaq Wellness Driver into the /proc filesystem.

1.2 System Temperature Monitoring

A Compaq server may contain several temperature sensors. If the normal operating range is exceeded for any of these sensors, the Compaq Wellness Driver does the following:

• Displays a message to the console stating the problem

• Makes an entry in the system health log

• Shuts the system down (optionally) to avoid hardware damage

Use the Compaq System Configuration Utility to control the shutdown option.

1.3 System Fan Monitoring

If a cooling fan fails, the Compaq Wellness Driver does the following:

• Displays a message to the console stating the problem

• Makes an entry in the system health log

• Shuts the system down (optionally) to avoid hardware damage

Use the Compaq System Configuration Utility to control the shutdown option.

1.4 Monitoring the System Fault Tolerant Power Supply

If a primary power supply fails, the server automatically switches over to a backup power supply. The system wellness driver does the following:

• Displays a message to the console stating the problem

• Makes an entry in the system health log

1.5 ECC Memory Monitoring

If a correctable ECC memory error occurs, the driver logs the error in the health log including the memory address causing the error. If too many errors occur at the same memory location, the driver disables the ECC error interrupts to prevent flooding the console with warnings (the hardware automatically corrects the ECC error).

1.6 Automatic Server Recovery (ASR)

The Automatic Server Recovery feature is implemented using a "heartbeat" timer that continually counts down. The driver frequently reloads the counter to prevent it from counting down to zero. If the ASR timer counts down to 0, it is assumed that the operating system is locked up, and the system automatically attempts to reboot. Before rebooting, the driver does the following:

• Displays a message on the console stating the problem

• Makes an entry in the system health log.

This server feature is configured using the Compaq System Configuration Utility.
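The countdown-and-reload idea behind ASR can be illustrated with a toy simulation. This is only a sketch of the general watchdog technique, not the driver's code: the function name, its parameters, and the use of abstract "ticks" as the time unit are all assumptions made for illustration.

```shell
#!/bin/sh
# Toy simulation of a countdown ("heartbeat") watchdog. Illustrative only;
# this is not how cpqhealth is implemented.
#
# simulate_asr TIMEOUT RELOAD_EVERY HANG_AT TOTAL_TICKS
#   TIMEOUT      value the counter is reloaded to (hypothetical unit: ticks)
#   RELOAD_EVERY the simulated driver reloads the counter every N ticks
#   HANG_AT      tick at which the simulated system locks up (0 = never)
#   TOTAL_TICKS  number of ticks to simulate
simulate_asr() {
    timeout=$1
    reload_every=$2
    hang_at=$3
    total=$4
    counter=$timeout
    tick=1
    while [ "$tick" -le "$total" ]; do
        alive=1
        if [ "$hang_at" -gt 0 ] && [ "$tick" -ge "$hang_at" ]; then
            alive=0                   # a locked-up system stops reloading
        fi
        if [ "$alive" -eq 1 ] && [ $((tick % reload_every)) -eq 0 ]; then
            counter=$timeout          # heartbeat: reload the counter
        else
            counter=$((counter - 1))  # otherwise the timer keeps counting down
        fi
        if [ "$counter" -le 0 ]; then
            echo "tick $tick: counter reached 0, system would reboot"
            return 1
        fi
        tick=$((tick + 1))
    done
    echo "ran $total ticks without expiry"
    return 0
}
```

Calling simulate_asr 5 2 0 20 models a healthy system whose counter never expires, while simulate_asr 5 2 7 20 models a lock-up at tick 7, after which the counter runs down and the function reports the simulated reboot.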

2 Setup Procedure

The health and wellness driver is available as a Red Hat Package Manager file (RPM). As with every RPM file the following options are available: you may install, query, refresh and uninstall the package. For the remainder of this section, we discuss how to install and uninstall the package (for more information about RPM files, see the appropriate How-To document). We also show you how the driver should react during regular operation.

2.1 Install

If you have a previous version of the Health Driver installed, it is important to uninstall this version before installing the new RPM file. See section 2.4 for information on uninstalling the driver.

After obtaining the RPM file, login as root and type the following to install the driver:

rpm -ivh cpqhealth-2.1.0-11..i386.rpm

The RPM file may have a different version number depending on supported systems and functionality. The distribution part of the file name refers to the Linux distribution supported by the RPM. It is very important to install the RPM only on the supported distribution. The driver will be inserted immediately. On systems with variable-speed fans, you may notice that the fans start spinning more slowly if the temperature is reasonably low. To check whether the driver is loaded properly, type the following (only available as system admin):

lsmod

You should see an entry indicating that the driver "cpqhealth" was inserted.
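For scripted checks, the same test can be made without reading the lsmod output by eye. The following is a minimal sketch, assuming the kernel lists loaded modules in /proc/modules with the module name as the first field; the function name and its optional second argument are illustrative and not part of the cpqhealth package.

```shell
#!/bin/sh
# is_module_loaded NAME [MODULES_FILE]
# Succeeds if NAME appears as a loaded module. MODULES_FILE defaults to
# /proc/modules (the list that lsmod formats); the optional second argument
# exists only so the function can be tried against a sample file.
is_module_loaded() {
    name=$1
    file=${2:-/proc/modules}
    [ -r "$file" ] || return 2
    # The module name is the first whitespace-separated field on each line.
    awk -v m="$name" '$1 == m { found = 1 } END { exit (found ? 0 : 1) }' "$file"
}
```

A typical call would be: is_module_loaded cpqhealth && echo "cpqhealth is loaded"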

This driver is currently supported according to the following matrix:

Health Driver Version 1.0.0-1, Release Date 08/08/2000

• Distributions: Red Hat Linux 6.1, 6.2; SuSE Linux 6.3; TurboLinux Server 6.0; Caldera OpenLinux eServer 2.3. Compaq Servers: ProLiant 800, 1600, 1850R, 3000, 5500, 6400R, 8000, 8500, DL360, DL380, DL580, ML330, ML350, ML370, ML530, ML570.

Health Driver Version 1.1.0-2, Release Date 11/03/2000

• Distributions: Red Hat Linux 6.2, 7.0; SuSE Linux 6.3, 7.0; TurboLinux Server 6.0, 6.0.5; Caldera OpenLinux eServer 2.3. Compaq Servers: ProLiant DL360, DL380, DL580, ML330, ML350 1 GHz, ML370, ML530, ML570.

Health Driver Version 1.1.1-1, Release Date 11/15/2000

• Distributions: Red Hat Linux 6.2, 7.0; SuSE Linux 6.3, 7.0; TurboLinux Server 6.0, 6.0.5; Caldera OpenLinux eServer 2.3. Compaq Servers: ProLiant DL320, DL360, DL380, DL580, ML330, ML350 1 GHz, ML370, ML530, ML570.

Health Driver Version 1.2.0-1, Release Date 12/14/2000

• Distributions: Red Hat Linux 6.2, 7.0; SuSE Linux 6.3, 7.0; TurboLinux Server 6.0.5; Caldera OpenLinux eServer 2.3.1. Compaq Servers: ProLiant 8000, 8500, DL320, DL360, DL380, DL580, ML330, ML350 1 GHz, ML370, ML530, ML570.

Health Driver Version 2.0.0-11, Release Date 04/02/2001

• Distributions: Red Hat Linux 6.2, 7.0; SuSE Linux 6.3, 7.0; TurboLinux Server 6.0.5; Caldera OpenLinux eServer 2.3.1. Compaq Servers: ProLiant DL320, DL360, DL380, DL580, ML330, ML350 1 GHz, ML370, ML530, ML570.

• Distributions: Red Hat Linux 7.0; SuSE Linux 7.0; Caldera OpenLinux eServer 2.3.1. Compaq Servers: ProLiant 8000, 8500, DL760, ML330e, ML750.

Health Driver Version 2.1.0-11, Release Date 05/31/2001

• Distributions: Red Hat Linux 6.2, 7.0, 7.1; SuSE Linux 7.0; Caldera OpenLinux eServer 3.1. Compaq Servers: ProLiant DL320, DL360, DL380, DL580, ML330, ML350 1 GHz, ML370, ML530, ML570.

• Distributions: Red Hat Linux 7.0, 7.1; SuSE Linux 7.0; Caldera OpenLinux eServer 3.1. Compaq Servers: ProLiant 8000, 8500, DL760, ML330e, ML750.

On any other machine, you will get an error message when you attempt to install the package. The driver will not be operational and it is advisable to uninstall the driver at your earliest convenience.

2.2 Upgrading the Driver

The Red Hat Package Manager provides the option to upgrade an RPM package. Before upgrading, it is important to uninstall any RPM packages that are dependent on the health driver, such as the Compaq Management Agents and the Compaq Remote Insight Driver, since these packages are dependent on a specific health driver version. Attempting to install these packages on an unsupported health driver version may result in an unstable system. Type the following, in order, to uninstall any of these packages if they are present on your system:

rpm -e cmanic
rpm -e cmastor
rpm -e cmasvr
rpm -e cmafdtn
rpm -e cpqrid

To upgrade the health driver, type the following command:

rpm -Uvh cpqhealth-2.1.0-11..i386.rpm

Please note that if the upgrade option is used, the health driver will be stopped after installation to preserve system stability. Please upgrade any components dependent on the Compaq Health Driver (cpqrid, cmafdtn, cmasvr, cmanic, cmastor).

To start the Health driver, type one of the following commands.

For Red Hat, Caldera, TurboLinux:

/etc/rc.d/init.d/cpqhealth start

For SuSE:

/etc/rc.d/cpqhealth start
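A script that has to run on more than one distribution can probe for the init script rather than hard-coding its path. The sketch below checks only the two locations named above; the function name and the optional root prefix are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# find_cpqhealth_init [ROOT]
# Prints the first cpqhealth init script that exists and is executable,
# checking the Red Hat/Caldera/TurboLinux location first, then the SuSE one.
# ROOT defaults to the empty string (the real filesystem); the optional
# prefix exists only so the function can be tried in a scratch directory.
find_cpqhealth_init() {
    root=${1:-}
    for script in /etc/rc.d/init.d/cpqhealth /etc/rc.d/cpqhealth; do
        if [ -x "$root$script" ]; then
            echo "$root$script"
            return 0
        fi
    done
    return 1
}
```

As root, the driver could then be started with: script=$(find_cpqhealth_init) && "$script" start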

2.3 Running the Driver

You will notice that once installed, the driver will be automatically loaded every time your server boots up.

Several /proc entries are available when the driver is running. They are:

• /proc/cpqtmp: Temperature data

• /proc/cpqfan: Fan data

• /proc/cpqpow: Power supply data

• /proc/cpqiml: Integrated Management Log data

• /proc/cpqnvr: Miscellaneous NVRAM data

The contents of the /proc entries are described in section 4.
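To confirm quickly which of these entries the running driver actually exposes, a short loop over the names listed above is enough. This is a minimal sketch; the function name and its optional directory argument are assumptions added for illustration.

```shell
#!/bin/sh
# list_health_entries [PROC_DIR]
# Prints one line per /proc entry named in this section, marking each as
# "present" or "missing". PROC_DIR defaults to /proc; the optional argument
# is only there so the function can be exercised against a scratch directory.
list_health_entries() {
    dir=${1:-/proc}
    for entry in cpqtmp cpqfan cpqpow cpqiml cpqnvr; do
        if [ -e "$dir/$entry" ]; then
            echo "$entry present"
        else
            echo "$entry missing"
        fi
    done
}
```

If every entry is reported as missing, the driver is most likely not loaded.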

For additional information and help, a man page is available by typing:

man cpqhealth

2.4 Uninstall

Uninstalling follows the standard RPM procedure and is achieved by typing:

rpm -e cpqhealth

If the health driver is running, it will be shut down. Should you reboot the system, the health driver will NOT be inserted at bootup time.

If you do not recall the version of the health driver installed, the following command may be used to discover the package version:

rpm -q cpqhealth

If you ever want to unload the driver, simply type (as system admin):

rmmod cpqhealth

The health driver will be unloaded from the running kernel. Using the rmmod command will not prevent the driver from being inserted at bootup time. While the driver is loaded, any error condition is logged to the system log, to the health log, and to the (text) console. In case of an emergency, the health driver will attempt to shut your system down gracefully.

2.5 Behind the Scenes

A prototype of the driver is inserted in /lib/modules/Compaq/drivers/. Furthermore, a copy of the driver is placed in /lib/modules//misc. This allows the insertion of the health driver by hand from anywhere in the file system.

The health driver exposes the following device nodes that are used to control its operation. These character device nodes all have a major number of 207, and the minor numbers are as follows:

 0 = /dev/cpqhealth/cpqw    Redirector interface
 1 = /dev/cpqhealth/crom    EISA CROM
 2 = /dev/cpqhealth/cdt     Data Table
 3 = /dev/cpqhealth/cevt    Event Log
 4 = /dev/cpqhealth/casr    Automatic Server Recovery
 5 = /dev/cpqhealth/cecc    ECC Memory
 6 = /dev/cpqhealth/cmca    Machine Check Architecture
 7 = /dev/cpqhealth/ccsm    Deprecated CDT
 8 = /dev/cpqhealth/cnmi    NMI Handling
 9 = /dev/cpqhealth/css     Sideshow Management
10 = /dev/cpqhealth/cram    CMOS interface
11 = /dev/cpqhealth/cpci    PCI IRQ interface

In order to insert the driver at bootup time, the rc.local script is modified during the install process.

3 Console Messages

When events occur outside of normal operations, the health driver may display a console message. The following is a list of console messages the health driver provides as it monitors system health.

View the IML (the IML is described more fully in Section 4) to identify where the fault lies when failures are reported, and take the appropriate action.

3.1 Memory

• A memory module has exceeded its threshold of correctable errors. Monitoring of ECC errors has been turned off.

If a memory module fails, view the IML to identify the faulty memory module. Plan for maintenance downtime and replace the module.

3.2 Thermal Sensors (Temperature)

• Approaching Dangerous Temperature. The _ Thermal Sensor (#_) is reporting overheating conditions.

• A dangerous temperature condition has been detected by a _ Thermal Sensor (#_).

• Normal conditions have returned to a Thermal Sensor (#_) in the _ group.

If the temperature exceeds the acceptable threshold, ensure that all system fans are functional and that airflow to all system vents is not obstructed. Check room temperature and make sure air conditioning is not turned off at night.

3.3 Fans

• A redundant fan (fan #_) in the _ group has failed.

• A redundant fan (fan #_) in the _ group has returned to normal.

• A fan (fan #_) in the _ group has failed.

• Fan _ located in _ has returned to normal operation.

• A required system fan (fan #_) in the _ group has failed.

• A required system fan (fan #_) in the _ group has returned to normal operation.

• Non-critical Thermal Failure System fan _ has failed.

• Non-critical Thermal Failure System fan _ has returned to normal.

• A Critical fan (fan #_) located in the _ has failed.

• The system of fans, located in the _ area, is no longer redundant.

• The system of fans, located in the _ area, is now redundant.

• Fan _ located in _ has been inserted.

• Fan _ located in _ has been removed.

If a critical fan has failed, replace the specified fan immediately, even if the fan appears functional (spinning). If a redundant fan has failed, replace the fan during scheduled maintenance.

3.4 Power Supplies

• A Power Supply (Power Supply #_) in the _ group is not providing power.

• A redundant Power Supply (Power Supply #_) in the _ group has returned to normal.

• Power Supply system located in _ is no longer redundant.

• Power Supply system located in _ is now redundant.

• Power Supply _ located in _ has been inserted.

• Power Supply _ located in _ has been removed.

Check the status and connections on all power supplies when failures are reported. If a power supply has failed, replace the specified power supply.

3.5 Processor Power Modules

• A Processor Power Module (#_) has failed (slot _, socket _). The system will continue to operate.

• A Processor Power Module (#_) located in (slot _, socket _) has returned to normal operation.

• Processor Power Module sub-system located in (slot _, socket _) is no longer redundant.

• Processor Power Module sub-system located in (slot _, socket _) is now redundant.

• Processor Power Module _ located in (slot _, socket _) has been inserted.

• Processor Power Module _ located in (slot _, socket _) has been removed.

If a processor power module has failed, replace the specified processor power module.

4 Information Retrieval

Fan, power supply, and temperature information is available through the health driver's /proc entries. Internally, the driver holds many more information items. We will expose them over time and probably create a special subdirectory for all Compaq-related entries.

Currently, all information is summarized through tables. Each table row represents one instance of a hardware device (temperature sensor, fan, power supply). The table columns are as follows:

4.1 Temperature

1. Instance Number of the temperature sensor

2. Type of sensor.

3. Over Threshold? (0 = no, 1 = yes)

4. Data Available? (0 = no, 1 = yes)

5. Current Temperature Valid? (0 = no, 1 = yes)

6. Current Temperature in degrees Celsius

7. Threshold Temperature Valid? (0 = no, 1 = yes)

8. Threshold Temperature
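The column layout above lends itself to simple filtering with awk. The sketch below lists sensors whose "Over Threshold?" column is 1. It assumes, without confirmation from this guide, that /proc/cpqtmp holds one sensor per line with the eight columns separated by whitespace and no header line; the function name is illustrative.

```shell
#!/bin/sh
# report_hot_sensors [FILE]
# Assumption (not confirmed by this guide): each line of FILE holds the eight
# columns listed above, separated by whitespace, with no header line.
# FILE defaults to /proc/cpqtmp.
report_hot_sensors() {
    file=${1:-/proc/cpqtmp}
    [ -r "$file" ] || return 2
    awk 'NF >= 8 && $3 == 1 {
        # Columns 5 and 7 say whether the values in columns 6 and 8 are valid.
        temp  = ($5 == 1) ? ($6 " C") : "unknown"
        limit = ($7 == 1) ? ($8 " C") : "unknown"
        printf "sensor %s (type %s): current %s, threshold %s\n", $1, $2, temp, limit
    }' "$file"
}
```

If the real file uses a different separator or includes a header, the awk field numbers and the NF test would need to be adjusted accordingly.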

4.2 Fan

1. Instance Number

2. Type of Fan

3. Location Designator

4. Speed State of Fan

5. Redundant Partner (Instance Number or 0 if not applicable)

6. Redundant Fan? (0 = no, 1 = yes)

7. Is Primary Fan? (0 = no, 1 = yes)

8. Is Hot Pluggable Fan? (0 = no, 1 = yes)

4.3 Power Supply

1. Instance Number

2. Type of Power Supply

3. Number of Ratings (can be 0)

4. Number of Channels (can be 0)

5. Number of Temperature Sensors (can be 0)

6. Number of Fans (can be 0)

7. A number of rows describing each rating's attributes (see below)

8. A number of rows describing each channel's attributes

9. A number of rows describing each temperature sensor's attributes

10. A number of rows describing each fan's attributes

The rating is the standard specification for the power supply. Here are its attributes:

1. Is Data Valid? (0 = no, 1 = yes)

2. Threshold Voltage

3. Total Power Output in Watts

Each channel has the following attributes:

1. Is Data Valid? (0 = no, 1 = yes)

2. Current Voltage in mV

3. Current Amperes in mA

4. Current Wattage in mW

The temperature sensor's attributes are as follows:

1. Is Data Valid? (0 = no, 1 = yes)

2. Current Temperature in Celsius

3. Current Threshold Temperature in Celsius

The cooling fan's information is:

1. Is Data Valid? (0 = no, 1 = yes)

2. Type of Fan

3. Current Speed State

4.4 Integrated Management Log (IML)

The log entry is structured like this:

• Description of Event (human readable)

• Type of Event (human readable)

• Severity (Information, Repaired, Caution, Failed, Unknown)

• Count (how many times the event was observed)

• Updated Time (hh:mm MM/DD/YYYY) (last time the event took place)

• Initial Time (hh:mm MM/DD/YYYY) (first time the event took place)

• Event Number

5 Compaq Integrated Management Log Viewer (IML Viewer)

The information in the Integrated Management Log may also be leveraged through the IML Viewer application, which is also included in the Red Hat Package Manager file. The IML records system events, critical errors, power-on messages, memory errors, and any catastrophic hardware or software errors that typically cause a system to fail. The IML Viewer allows the manipulation of this data.

5.1 Running the IML Viewer

The IML Viewer is an application that runs in the X-Windows environment. Type the following to run the IML Viewer:

cpqimlview

The Compaq Integrated Management Log Viewer automatically displays the current entries in the IML.

Each event in the IML Viewer has one of the following statuses to identify the severity of the event:

• Informational - General information about a system event

• Repaired - An indication that this entry has been repaired

• Caution - An indication that a non-fatal error condition has occurred

• Critical - A component of the system has failed

The severity of the event and other information in the IML Viewer help to quickly identify and correct problems, thus minimizing downtime. The IML Viewer provides several capabilities that make it easier to identify, correct, and document server health issues. The following describes the menu options available.

5.2 File Menu

The File Menu options include:

• Open... - open a previously saved file and display the contents in the IML Viewer

• Save As... - save the current entries of the IML to a file. This operation does not affect the current contents of the IML. This allows archival of IML data for input into a text editor or spreadsheet application or other IML Viewer utility. The File Name entry should specify the full path for the desired file name. If no path is specified, the file will be saved in the current directory.

• Exit - close the IML Viewer window and exit the application

5.3 Log Menu

The Log Menu options include:

• Clear All Entries - clear the IML. It is recommended to save the current contents into a file before emptying the log.

• Mark As Repaired - mark a specific entry as repaired.

• Add Maintenance Note - mark a specific entry with maintenance information.

5.4 View Menu

The View Menu options include:

• Filter - filter IML events to display only desired event types

• Refresh Now - re-read and re-display entire current IML

• Sort Events - sort IML events by event types

6 Troubleshooting

This section describes common problems that might occur during installation and operation of the Health and Wellness Driver. In most cases, a workaround is available and is described below.

6.1 Non Certified Machines

Symptom When the Health and Wellness Driver RPM file is installed you will get the following message:

Inserting Health & Wellness Driver... This Compaq Health & Wellness Driver is not certified for your system. Uninstall this package at your earliest convenience.

The driver is not inserted into the list of modules. When trying to force the insertion of the driver in /lib/modules/Compaq/drivers with insmod, the following message is output:

cpqhealth.o: init_module: Device or resource busy

Cause The Linux Compaq Health and Wellness Driver is only certified for a subset of systems that Compaq offers. The driver is deactivated for all other hardware and will not function by design.

Workaround There is no workaround for this.

6.2 Health Driver Immediately Stops after Installation

Symptom When the Health and Wellness Driver RPM file is installed you will get the following message:

If you are using the upgrade option of the RPM package, the Compaq Health Driver will be stopped to prevent an unstable system. Please upgrade any components dependent on the Compaq Health Driver (cmafdtn, cmasvr, cmanic, cmastor, cpqrid) To restart the Compaq Health Driver, type "/etc/rc.d/init.d/cpqhealth start"

Cause The Compaq Management Agents are only certified for a specific version of the Linux Compaq Health and Wellness Driver. Since this is the case, using the upgrade option instead of the install option can invite certain problems in the interaction between the agents and the health driver that may result in an unstable system. To prevent this, the health driver will be stopped after an upgrade to provide the system administrator with the opportunity to make sure the system has no dependencies that may cause problems.

Workaround Use the rpm -ivh command instead of the rpm -Uvh command to install the health driver. Uninstall any dependent packages:

rpm -e cmanic
rpm -e cmastor
rpm -e cmasvr
rpm -e cmafdtn
rpm -e cpqrid
rpm -e cpqhealth
rpm -ivh cpqhealth-2.1.0-11..i386.rpm

6.3 No Console Messages

Symptom If you run a SuSE distribution, you will see no console messages appearing on the text screens (Ctrl+Alt+F1, for instance). However, the error messages do get logged properly in /var/log/messages.

If you run KDE or Gnome, xterms will also not show the console messages originating from the health driver.

Cause SuSE configures the syslogd daemon slightly differently than other distributions: The system messages will not appear on the lower digit terminals (tty1-9), but will exclusively appear on tty10 (Ctrl+Alt+F10).

Workaround If you are not happy with the message logging on your system, you may configure it differently by modifying /etc/syslog.conf in the following way:

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
kern.* /dev/console
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;news.none;authpriv.none /var/log/messages

Then send a "HUP" signal to the syslogd process by passing its process ID to the following command. Afterwards, you should see your kernel messages appearing on all consoles.

kill -1

6.4 Failed Dependencies

Symptom If you insert the Health and Wellness driver on a system with a minimal Linux install, you might encounter the following message:

error: failed dependencies:
        egcs >= 1.1.2 is needed by cpqhealth-2.1.0-11
        /bin/objcopy is needed by cpqhealth-2.1.0-11

After downloading the egcs RPM file and trying to install it, similar error messages will cause the egcs installation to fail as well.

Cause The health and wellness driver depends on two (fairly basic) Linux applications being present in the system: gcc, the C compiler, and objcopy, a binary utility to fix up object modules. The former is contained in a package called egcs, the latter in an RPM named binutils. Unfortunately, these RPM files have dependencies on other packages which in turn have their own dependencies, etc. The Red Hat Package Manager is designed to detect failed dependencies which have to be met step by step. This can be frustrating to the user.

Workaround The packages that egcs and gcc depend on are listed below. You must install the packages in the following order to prevent failed dependencies.

1. ld-config
2. glibc
3. info-install
4. readline
5. termcap
6. libtermcap
7. bash
8. binutils
9. gcc-cpp
10. kernel-headers
11. glibc-devel
12. make
13. egcs

In order to install these packages we recommend the following procedure:

• Download the packages mentioned in the above list from your distribution ftp site (ftp.redhat.com/pub, for instance). Keep in mind that they will have a version number suffix, such as egcs-1.1.2-25.i386.rpm.

• Alternatively, navigate into the RPMS folder in your distribution CD (mount /dev/cdrom; cd /mnt/cdrom/RedHat/RPMS).

• Open a terminal (Ctrl+Alt+F1 or new xterm).

• Now, type in rpm -ivh ld-config-* (the wildcard is necessary, since the actual version numbers differ from distribution to distribution). You may also use tab completion (pressing tab after the package name).

• Work down the list you see above. You will either get a message informing you that the package has already been installed (likely candidates are glibc, bash and termcap), or the package will unpack without problems on your system (binutils, glibc-devel, make).
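The procedure above can be scripted so that the order is not left to memory. The following is a sketch only: the function name and the DRY_RUN variable are illustrative, the last package is spelled egcs as in the rpm error message shown earlier, and the file pattern assumes that each package version starts with a digit (so that glibc does not also match glibc-devel).

```shell
#!/bin/sh
# install_in_order RPM_DIR
# Walks the package list from this section in order and installs the first
# file in RPM_DIR that matches each name. Set DRY_RUN=1 to print the rpm
# commands instead of running them.
install_in_order() {
    dir=$1
    for pkg in ld-config glibc info-install readline termcap libtermcap \
               bash binutils gcc-cpp kernel-headers glibc-devel make egcs; do
        # Assumption: versions begin with a digit, e.g. NAME-1.2.3-4.i386.rpm.
        set -- "$dir/$pkg"-[0-9]*.rpm
        file=$1
        if [ ! -e "$file" ]; then
            echo "skip $pkg: no RPM found in $dir" >&2
            continue
        fi
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "rpm -ivh $file"
        else
            # rpm also fails when the package is already installed; keep going.
            rpm -ivh "$file" || echo "rpm reported a problem with $pkg" >&2
        fi
    done
}
```

Running DRY_RUN=1 install_in_order /mnt/cdrom/RedHat/RPMS first shows which commands would be issued before anything is changed on the system.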

6.5 Failure in cpqimlview

Symptom When starting cpqimlview, the IML Viewer, you will see the following message:

The IML is not functioning after this error message appears.

Cause The problem lies in the fact that the health driver is not inserted on your system. This, for instance, could have happened, when cpqimlview was used while the Health and Wellness Driver package was uninstalled. Another reason is that your system is not certified for the current version of the health driver.

Workaround You may want to try to insert the driver manually by typing the following in a console window:

modprobe cpqhealth.o

This will insert the health driver (verify by typing 'lsmod'). If that does not work, then your system is most likely not certified for the health driver.

6.6 Superuser Only

Symptom You will experience the following problems:

• Commands like insmod, modprobe, rmmod, or rpm are not available.

• The rpm install will fail because of file permissions being denied.

failed to open //var/lib/rpm/packages.rpm error: cannot open //var/lib/rpm/packages.rpm

• The command cpqimlview is not known or fails because of file permissions.

Cause Preparing a driver install necessitates access to system administrator rights.

Workaround Be sure to log in as root before you attempt the driver install.
